Cuda MAP/Sample on new bit circuits #116

Merged
merged 33 commits into from
Mar 7, 2022
1628b67
[WIP] Cuda MAP on new bit circuits
khosravipasha Feb 23, 2022
5adc6ad
revert temp fix
khosravipasha Feb 23, 2022
e276ed9
cuda map fixes
khosravipasha Feb 24, 2022
f8a1e3c
fix out of bound issue
khosravipasha Feb 24, 2022
ed6fbd4
preprocess input map, map-ll for faster cuda map
khosravipasha Feb 25, 2022
9139d3e
fix typo
khosravipasha Feb 25, 2022
5156f9c
sample cpu api fix
khosravipasha Feb 25, 2022
a6d345e
fix map cuda downward type issue
khosravipasha Feb 25, 2022
76d59ce
add cuda sampling
khosravipasha Feb 26, 2022
d9f6e09
add comment
khosravipasha Feb 26, 2022
a39c278
add cpu test + more gpu tests likelihoods
khosravipasha Feb 28, 2022
3be43c5
add gpu map tests
khosravipasha Mar 1, 2022
c06aacb
sample tests
khosravipasha Mar 1, 2022
3b5792b
api fixes
khosravipasha Mar 1, 2022
eb10a24
add Random dep to test
khosravipasha Mar 1, 2022
d1acfff
clean up artifcat
khosravipasha Mar 1, 2022
e192b20
doc changes
khosravipasha Mar 1, 2022
cb24911
doc fixes
khosravipasha Mar 1, 2022
93bbc89
doc fixes
khosravipasha Mar 1, 2022
e3c837b
RAT doc
khosravipasha Mar 2, 2022
dba57cd
deploy docs
khosravipasha Mar 2, 2022
f48dd35
docs + RAT api upgrades
khosravipasha Mar 3, 2022
3e6f420
example for rat cat
khosravipasha Mar 4, 2022
4fba431
export necessary functions
liuanji Mar 4, 2022
b8edea2
MNIST training example notebook
liuanji Mar 4, 2022
93e4397
multi input type example
khosravipasha Mar 4, 2022
09114b1
Merge branch 'cudamap' of https://github.com/Juice-jl/ProbabilisticCi…
khosravipasha Mar 4, 2022
637aee2
dist + clearmemory for non inputs
khosravipasha Mar 5, 2022
4dbff22
minimal
khosravipasha Mar 5, 2022
9531cda
update multi input type example
khosravipasha Mar 7, 2022
95eb776
fix deploy docs
khosravipasha Mar 7, 2022
c2e5fbd
update docs project.toml
khosravipasha Mar 7, 2022
7d3fb13
remove docs-heavy
khosravipasha Mar 7, 2022
2 changes: 1 addition & 1 deletion .github/workflows/deploy_docs.yml
@@ -34,5 +34,5 @@ jobs:
sudo apt install -y pdf2svg texlive-latex-base texlive-binaries texlive-pictures texlive-latex-extra texlive-luatex
luatex -v
pdflatex -v
- julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(name="LogicCircuits")); Pkg.develop(PackageSpec(name="DiscriminativeCircuits")); Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate();'
+ julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate();'
julia --project=docs/ docs/make.jl
1 change: 1 addition & 0 deletions .gitignore
@@ -9,6 +9,7 @@
*.code-workspace
*.vscode
**checkpoint.ipynb
.ipynb_checkpoints
*Manifest.toml
docs/build/
scratch/
7 changes: 0 additions & 7 deletions Artifacts.toml

This file was deleted.

177 changes: 0 additions & 177 deletions README.md
@@ -8,183 +8,6 @@

This package provides functionalities for learning/constructing probabilistic circuits and using them to compute various probabilistic queries. It is part of the [Juice package](https://github.com/Juice-jl) (Julia Circuit Empanada).

## Example usage


Assuming that the ProbabilisticCircuits Julia package has been installed with `julia -e 'using Pkg; Pkg.add("ProbabilisticCircuits")'`, we can start using it as follows.

```julia
using ProbabilisticCircuits
```

### Reasoning with manually constructed circuits

We begin by creating three positive literals (boolean variables) and manually construct a probabilistic circuit that encodes a Naive Bayes (NB) distribution with the following form: `Pr(rain, rainbow, wet) = Pr(rain) * Pr(rainbow|rain) * Pr(wet|rain)`.

```julia
rain, rainbow, wet = pos_literals(ProbCircuit, 3)
rain_pos = (0.7 * rainbow + 0.3 * (-rainbow)) * (0.9 * wet + 0.1 * (-wet)) # Pr(rainbow|rain=1) * Pr(wet|rain=1)
rain_neg = (0.2 * rainbow + 0.8 * (-rainbow)) * (0.3 * wet + 0.7 * (-wet)) # Pr(rainbow|rain=0) * Pr(wet|rain=0)
circuit = 0.4 * (rain * rain_pos) + 0.6 * ((-rain) * rain_neg); # Pr(rain, rainbow, wet)
```

Just like any probability distribution, we can evaluate the probabilistic circuit on various inputs. Note that since probabilistic circuits use log probabilities for numerical stability, we need to exponentiate the evaluation output to get the probabilities.

```julia
exp(circuit(true, true, true)) # Pr(rain=1, rainbow=1, wet=1)
```

```
0.252f0
```

```julia
exp(circuit(true, false, false)) # Pr(rain=1, rainbow=0, wet=0)
```

```
0.011999999f0
```

From the above examples, we see that it is less likely to rain if we do not see rainbows and the streets are not wet.

The purpose of this package is to offer a unified tool for efficient learning and inference (i.e., answering probabilistic queries such as marginals and MAP) over probabilistic circuits, which subsume a large class of tractable probabilistic models. We first use the above manually constructed circuit to demonstrate several queries that can be answered efficiently. Similar to [logic circuits](https://github.com/Juice-jl/LogicCircuits.jl), answering the following queries requires *decomposability* and *determinism*, both of which are already satisfied by construction:

```julia
isdecomposable(circuit) && isdeterministic(circuit)
```

```
true
```

Decomposability allows us to compute marginal probabilities given partial evidence efficiently (linear time w.r.t. the circuit size). For example, we want to ask the probability of observing rainbows. That is, we want to marginalize out the variables rain and wet. This can be done by evaluating the circuit with partial evidence:

```julia
exp(circuit(missing, true, missing)) # Pr(rainbow=1)
```

```
0.39999998f0
```
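
The marginal above can be checked by hand, independently of the package, by summing the joint distribution over the unobserved variables. The following is a minimal sketch in plain Julia; `joint` is a hypothetical helper that hard-codes the Naive Bayes parameters from the manually constructed circuit and is not part of the ProbabilisticCircuits.jl API.

```julia
# Hypothetical helper (not the package API): joint probability of the
# Naive Bayes model, Pr(rain) * Pr(rainbow|rain) * Pr(wet|rain).
function joint(rain::Bool, rainbow::Bool, wet::Bool)
    p_rain    = rain ? 0.4 : 0.6
    p_rainbow = rain ? (rainbow ? 0.7 : 0.3) : (rainbow ? 0.2 : 0.8)
    p_wet     = rain ? (wet ? 0.9 : 0.1) : (wet ? 0.3 : 0.7)
    return p_rain * p_rainbow * p_wet
end

# Marginalize out rain and wet by brute-force summation over their values.
p_rainbow = sum(joint(r, true, w) for r in (false, true), w in (false, true))
println(p_rainbow)  # approximately 0.4, up to floating-point rounding
```

Brute-force summation is exponential in the number of marginalized variables; the point of decomposability is that the circuit obtains the same value in a single linear-time pass.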

Being able to compute marginals immediately offers the ability to compute conditional probabilities. For example, to compute the probability of raining given rainbow=1 and wet=1, we simply take the quotient of Pr(rain=1, rainbow=1, wet=1) and Pr(rainbow=1, wet=1):

```julia
exp(circuit(true, true, true) - circuit(missing, true, true)) # Pr(rain=1|rainbow=1, wet=1)
```

```
0.87500006f0
```
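
Subtracting two log probabilities before exponentiating is the same as dividing the probabilities. A minimal sketch in plain Julia that reproduces the conditional by hand; `joint` is again a hypothetical helper hard-coding the circuit's parameters, not a package function.

```julia
# Hypothetical helper (not the package API): joint probability of the
# Naive Bayes model, Pr(rain) * Pr(rainbow|rain) * Pr(wet|rain).
function joint(rain::Bool, rainbow::Bool, wet::Bool)
    p_rain    = rain ? 0.4 : 0.6
    p_rainbow = rain ? (rainbow ? 0.7 : 0.3) : (rainbow ? 0.2 : 0.8)
    p_wet     = rain ? (wet ? 0.9 : 0.1) : (wet ? 0.3 : 0.7)
    return p_rain * p_rainbow * p_wet
end

# log Pr(rain=1, rainbow=1, wet=1)
log_joint = log(joint(true, true, true))

# log Pr(rainbow=1, wet=1), obtained by summing out rain
log_evidence = log(sum(joint(r, true, true) for r in (false, true)))

# exp(log a - log b) == a / b
p_cond = exp(log_joint - log_evidence)
println(p_cond)  # approximately 0.875, up to floating-point rounding
```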

If the circuit additionally satisfies the structural property *determinism*, we can answer some more advanced queries. For example, we can compute the maximum a posteriori (MAP) assignment of the distribution:

```julia
assignments, log_prob = MAP(circuit, [missing, missing, missing])
print("The MAP assignment of the circuit is (rain=$(assignments[1]), rainbow=$(assignments[2]), wet=$(assignments[3])), with probability $(exp(log_prob)).")
```

```
The MAP assignment of the circuit is (rain=false, rainbow=false, wet=false), with probability 0.336.
```
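
For a model this small, the MAP result can be cross-checked by enumerating all eight assignments and keeping the most probable one. This is a sketch in plain Julia, not the algorithm the package uses; `joint` is a hypothetical helper that hard-codes the circuit's parameters.

```julia
# Hypothetical helper (not the package API): joint probability of the
# Naive Bayes model, Pr(rain) * Pr(rainbow|rain) * Pr(wet|rain).
function joint(rain::Bool, rainbow::Bool, wet::Bool)
    p_rain    = rain ? 0.4 : 0.6
    p_rainbow = rain ? (rainbow ? 0.7 : 0.3) : (rainbow ? 0.2 : 0.8)
    p_wet     = rain ? (wet ? 0.9 : 0.1) : (wet ? 0.3 : 0.7)
    return p_rain * p_rainbow * p_wet
end

# All 2^3 complete assignments, flattened into a vector of tuples.
assignments = vec([(r, b, w) for r in (false, true),
                                 b in (false, true),
                                 w in (false, true)])

# Score every assignment and pick the maximizer.
probs = [joint(a...) for a in assignments]
best_prob, best_idx = findmax(probs)
best = assignments[best_idx]

println("MAP assignment (rain, rainbow, wet) = $(best)")
```

Enumeration costs time exponential in the number of variables, which is why the structural property matters for larger circuits.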

Besides the above examples, ProbabilisticCircuits.jl provides functionalities for a wide variety of queries, which are detailed in [this manual](https://juice-jl.github.io/ProbabilisticCircuits.jl/stable/manual/queries/).

### Building complex circuit structures

ProbabilisticCircuits.jl provides tools to compile classic Probabilistic Graphical Models (PGMs) and Tractable Probabilistic Models (TPMs) into probabilistic circuits efficiently. For example, we can compile a factor graph (FG) into a probabilistic circuit with one line of code:

```julia
fg = fromUAI(zoo_fg_file("asia.uai")) # Load example factor graph
fg_circuit = ProbCircuit(compile_factor_graph(fg)[1]) # Compile the FG to a PC
print("`fg_circuit` contains $(num_edges(fg_circuit)) edges and $(num_parameters(fg_circuit)) parameters.")
```

```
`fg_circuit` contains 2554 edges and 320 parameters.
```

### Learning probabilistic circuits from data

ProbabilisticCircuits.jl offers various parameter learning and structure learning algorithms. It further supports mini-batch learning on both CPUs and GPUs, which makes learning large models from large datasets very efficient.

We use the binarized MNIST dataset to demonstrate example probabilistic circuit learning functionalities.

```julia
train_data, valid_data, test_data = twenty_datasets("binarized_mnist");
```

We start with learning the parameters of a *decomposable* and *deterministic* probabilistic circuit. We first load the structure of the circuit from file:

```julia
circuit = zoo_psdd("mnist.psdd")
print("The loaded circuit contains $(num_edges(circuit)) edges and $(num_parameters(circuit)) parameters.")
```

```
The loaded circuit contains 11280 edges and 5364 parameters.
```

```julia
print("Structural properties of the circuit: decomposability: $(isdecomposable(circuit)), determinism: $(isdeterministic(circuit)).")
```

```
Structural properties of the circuit: decomposability: true, determinism: true.
```

Given that the circuit is decomposable and deterministic, the maximum likelihood estimation (MLE) of its parameters is in closed-form. That is, we can learn the MLE parameters deterministically:

```julia
t = @elapsed estimate_parameters!(circuit, train_data; pseudocount = 0.1)
print("Learning the parameters on a CPU took $(t) seconds.")
```

```
Learning the parameters on a CPU took 0.243524592 seconds.
```
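
To illustrate what "closed-form" means here: in a deterministic circuit, each training example activates at most one child of every sum node, so the MLE weights of a sum node are normalized activation counts. The sketch below shows this for a single sum node in plain Julia. The helper `smoothed_mle`, the example counts, and the convention of adding the pseudocount to every child's count are all assumptions for illustration; the package may distribute the pseudocount differently.

```julia
# Hypothetical helper (not the package API): smoothed maximum-likelihood
# weights for one sum node, given how often each child was active.
# Assumption: the pseudocount is added to every child's count.
function smoothed_mle(counts::AbstractVector{<:Real}, pseudocount::Real)
    smoothed = counts .+ pseudocount
    return smoothed ./ sum(smoothed)
end

# Made-up activation counts for a sum node with two children.
counts = [30, 70]
theta = smoothed_mle(counts, 0.1)
println(theta)  # two weights that sum to 1
```

No iterative optimization is involved: one pass over the data to collect counts, followed by a normalization per sum node.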

Optionally, we can use GPUs to speed up the learning process:

```julia
t = @elapsed estimate_parameters!(circuit, train_data; pseudocount = 0.1)
print("Learning the parameters on a GPU took $(t) seconds.")
```

```
Learning the parameters on a GPU took 0.032219275 seconds.
```

Note that the speedup is modest because the circuit is too small to make full use of the GPU. For large circuits, the speedup can be ~10x or more.

After the learning process, we can evaluate the model on the validation/test dataset. Here we use average log-likelihood per sample as the metric (we again utilize GPUs for efficiency):

```julia
avg_ll = log_likelihood_avg(circuit, test_data)
print("The average test data log-likelihood is $(avg_ll).")
```

```
The average test data log-likelihood is -137.59309172113964.
```

Besides `estimate_parameters!`, ProbabilisticCircuits.jl offers iterative parameter learning algorithms such as Expectation-Maximization (EM) (i.e., `estimate_parameters_em!`) and Stochastic Gradient Descent (SGD) (i.e., `estimate_parameters_sgd!`).

ProbabilisticCircuits.jl also offers functionalities for learning the circuit structure and parameters simultaneously. For example, the Strudel structure learning algorithm is implemented natively in the package, and can be used with a few lines of code:

```julia
circuit_strudel = learn_circuit(train_data; maxiter = 100, verbose = false)
avg_ll = log_likelihood_avg(circuit_strudel, test_data)
print("The learned circuit contains $(num_edges(circuit)) edges and $(num_parameters(circuit)) parameters.\n")
print("The average test data log-likelihood is $(avg_ll).")
```

```
The learned circuit contains 11280 edges and 5364 parameters.
The average test data log-likelihood is -134.9860031603151.
```

## Testing

To make sure everything is working correctly, you can run our test suite as follows. Running the tests for the first time will trigger a few slow downloads of various test resources.
2 changes: 0 additions & 2 deletions docs-heavy/.gitignore

This file was deleted.

18 changes: 0 additions & 18 deletions docs-heavy/Project.toml

This file was deleted.

32 changes: 0 additions & 32 deletions docs-heavy/Readme.md

This file was deleted.

103 changes: 0 additions & 103 deletions docs-heavy/make.jl

This file was deleted.

30 changes: 0 additions & 30 deletions docs-heavy/manually_build_readme.jl

This file was deleted.
