This repository is still under development!
This repository is being developed as a complement to GroupAD.jl, where most of the experimental procedures are located. GenerativeMIL mostly provides advanced models for generative modeling of Multiple Instance Learning (MIL) data and set-structured data. Models are implemented in the most efficient way we could think of, though better approaches may exist.
Implemented models | CPU training | GPU training | variable cardinality[^1] (in/out)[^2] | note |
---|---|---|---|---|
SetVAE | yes | yes | yes/yes | 1:1 port of the Python code from the original repository. |
FoldingNet VAE | yes | yes[^3] | yes/no | batched training on CPU via broadcasting; GPU training in a special case[^3] |
PoolModel (ours) | yes | yes[^4] | yes/yes | TODO: masked forward pass for variable cardinality on GPU |
SetTransformer | yes | yes | yes/no | classifier version only |
Masked Autoencoder for Distribution Estimation (MADE) | yes | yes | possible[^5]/no | TODO: add support for multiple masks[^6] |
Masked Autoregressive Flow (MAF) | ? | ? | not finished | |
Inverse Autoregressive Flow (IAF) | ? | ? | not finished | |
SoftPointFlow | ? | ? | yes/yes | not finished |
SetVAEformer (ours) | yes | yes | yes/yes | not finished; similar to vanilla SetVAE but better ;) |
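For models marked as handling variable cardinality, the underlying trick (see footnote [^1]) is to pad every set in a batch to the largest cardinality and mask both the inputs and the intermediate outputs. A minimal sketch of that idea in plain Julia follows; the array layout and names are illustrative only, not the actual API of this package:

```julia
# Illustrative sketch only (not the GenerativeMIL API): batching sets of
# different cardinality by zero-padding and masking.

# Each set is a (features × cardinality) matrix; cardinalities differ.
sets = [randn(Float32, 3, 4), randn(Float32, 3, 7), randn(Float32, 3, 5)]

d = size(first(sets), 1)
maxcard = maximum(size.(sets, 2))
nsets = length(sets)

# Pad every set with zeros up to the maximal cardinality in the batch.
batch = zeros(Float32, d, maxcard, nsets)
mask = falses(1, maxcard, nsets)
for (k, s) in enumerate(sets)
    n = size(s, 2)
    batch[:, 1:n, k] .= s
    mask[:, 1:n, k] .= true
end

# Masked mean pooling: padded elements do not contribute, so each set's
# statistics are computed over its true cardinality only.
pooled = sum(batch .* mask; dims=2) ./ sum(mask; dims=2)  # (d × 1 × nsets)
```

The same mask has to be reapplied after every intermediate layer that mixes elements, which is exactly the missing piece noted above for the PoolModel's GPU path.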
This code base is using the Julia Language and DrWatson to make a reproducible scientific project named `GenerativeMIL`.
To (locally) reproduce this project, do the following:
- Download this code base. Notice that raw data are typically not included in the git-history and may need to be downloaded independently.
- Open a Julia console and do:
   ```julia
   julia> using Pkg
   julia> Pkg.add("DrWatson") # install globally, for using `quickactivate`
   julia> Pkg.activate("path/to/this/project")
   julia> Pkg.instantiate()
   ```
This will install all necessary packages for you to be able to run the scripts and everything should work out of the box, including correctly finding local paths.
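Scripts in DrWatson-based projects typically begin with the following two lines (standard DrWatson usage, shown here for orientation), which activate the project environment and make its relative paths resolve from anywhere:

```julia
using DrWatson
@quickactivate "GenerativeMIL" # activate the project before loading its code
```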
[^1]: By cardinality we mean the number of elements in a single bag/set. In real-world data this number can vary from set to set, which makes naive training in batches impossible. If a model contains a method for bypassing this problem, it is considered capable of handling "variable cardinality". Most models require modifications to fulfil this, such as masking the inputs as well as the intermediate outputs.

[^2]: "In" variable cardinality means that the sets in an input batch may have different cardinalities; "out" variable cardinality means that the model can output a batch with cardinalities different from those of the input batch, i.e. it can sample an arbitrary number of elements for each set.

[^3]: FoldingNet VAE is trainable on GPU via the function `fit_gpu_ready!`. It is a special case with fixed cardinality and without the KLD of the reconstructed encoding.

[^4]: At this point the PoolModel works only for constant cardinality.

[^5]: Since there is no cardinality reduction or expansion.

[^6]: This model is essentially a building block for MAF, IAF and SoftPointFlow.