Added and updated README files for vision models #218 #305
Changes from 46 commits
@@ -0,0 +1,45 @@
# Conditional DC-GAN

<img src="../cdcgan_mnist/output/img_for_readme.png" width="440"/>

[Source](https://arxiv.org/pdf/1411.1784.pdf)

## Model Info

Generative Adversarial Networks have two models, a _Generator model G(z)_ and a _Discriminator model D(x)_, in competition with each other. G tries to estimate the distribution of the training data, while D tries to estimate the probability that a data sample came from the original training data and not from G. During training, the Generator learns a mapping from a _prior distribution p(z)_ to the _data space G(z)_, and the Discriminator D(x) produces the probability that a given x came from the actual training data.

This model can be modified to include additional inputs, y, on which both models can be conditioned. y can be any type of additional input, for example class labels. _The conditioning can be achieved by simply feeding y to both the Generator — G(z|y) and the Discriminator — D(x|y)_.
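A rough Flux sketch of this conditioning (layer sizes and dimensions here are illustrative assumptions, not taken from `cGAN_mnist.jl`):

```julia
using Flux

# Illustrative sizes only (not copied from cGAN_mnist.jl):
latent_dim, n_classes = 100, 10

# Conditioning: the label's one-hot encoding y is concatenated with the
# generator's noise input z and with the discriminator's image input x.
G = Chain(Dense((latent_dim + n_classes) => 256, relu), Dense(256 => 784, tanh))
D = Chain(Dense((784 + n_classes) => 256, x -> leakyrelu(x, 0.2f0)), Dense(256 => 1))

z = randn(Float32, latent_dim, 8)                  # batch of 8 noise vectors
y = Float32.(Flux.onehotbatch(rand(0:9, 8), 0:9))  # matching one-hot class labels
fake  = G(vcat(z, y))      # G(z|y): 784×8 generated (flattened) images
score = D(vcat(fake, y))   # D(x|y): 1×8 real/fake scores
```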
## Training

```shell
cd vision/cdcgan_mnist
julia --project cGAN_mnist.jl
```

## Results

1000 training steps

![1000 training steps](../cdcgan_mnist/output/cgan_steps_001000.png)

3000 training steps

![3000 training steps](../cdcgan_mnist/output/cgan_steps_003000.png)

5000 training steps

![5000 training steps](../cdcgan_mnist/output/cgan_steps_005000.png)

10000 training steps

![10000 training steps](../cdcgan_mnist/output/cgan_steps_010000.png)

11725 training steps

![11725 training steps](../cdcgan_mnist/output/cgan_steps_011725.png)

## References

* [Mirza, M. and Osindero, S., “Conditional Generative Adversarial Nets”, <i>arXiv e-prints</i>, 2014.](https://arxiv.org/pdf/1411.1784.pdf)

* [Training a Conditional DC-GAN on CIFAR-10](https://medium.com/@utk.is.here/training-a-conditional-dc-gan-on-cifar-10-fce88395d610)
@@ -0,0 +1,32 @@
# LeNet-5

![LeNet-5](../conv_mnist/docs/LeNet-5.png)

[Source](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)

## Model Info

At a high level LeNet (LeNet-5) consists of two parts:
(i) _a convolutional encoder consisting of two convolutional layers_;
(ii) _a dense block consisting of three fully-connected layers_.

The basic units in each convolutional block are a convolutional layer, a sigmoid activation function, and a subsequent average pooling operation. Each convolutional layer uses a 5×5 kernel and a sigmoid activation function. These layers map spatially arranged inputs to a number of two-dimensional feature maps, typically increasing the number of channels. The first convolutional layer has 6 output channels, while the second has 16. Each 2×2 pooling operation (stride 2) reduces dimensionality by a factor of 4 via spatial downsampling. The convolutional block emits an output with shape (batch size, number of channels, height, width).
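The two-part structure above can be sketched as a Flux `Chain` for 28×28 MNIST inputs (a minimal sketch; the exact dense-layer widths follow the classic LeNet-5 paper and are not copied from `conv_mnist.jl`):

```julia
using Flux

lenet = Chain(
    # (i) convolutional encoder
    Conv((5, 5), 1 => 6, sigmoid),   # 28×28×1 -> 24×24×6
    MeanPool((2, 2)),                # -> 12×12×6 (2×2 average pooling, stride 2)
    Conv((5, 5), 6 => 16, sigmoid),  # -> 8×8×16
    MeanPool((2, 2)),                # -> 4×4×16
    # (ii) dense block
    Flux.flatten,                    # -> 256 features
    Dense(256 => 120, sigmoid),
    Dense(120 => 84, sigmoid),
    Dense(84 => 10),                 # one score per digit class
)

x = rand(Float32, 28, 28, 1, 1)  # (width, height, channels, batch)
size(lenet(x))                   # (10, 1)
```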
## Training

```shell
cd vision/conv_mnist
julia --project conv_mnist.jl
```

## References

* [Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)

* [@book{zhang2020dive,
  title={Dive into Deep Learning},
  author={Aston Zhang and Zachary C. Lipton and Mu Li and Alexander J. Smola},
  note={\url{https://d2l.ai}},
  year={2020}
  }](https://d2l.ai/chapter_convolutional-neural-networks/lenet.html)

> **Review comment:** This doesn't seem to render correctly for me. Can you confirm it's GitHub compatible?
>
> **Reply:** Looking at the preview it seems to be rendering properly. Should I change anything?
>
> **Reply:** Maybe that's your local markdown editor, because GitHub renders the raw bibtex. I'd recommend just using the plaintext citation like you have for the reference above.
```diff
@@ -157,4 +157,7 @@ function train(; kws...)
     end
 end
 
-train()
+if abspath(PROGRAM_FILE) == @__FILE__
+    train()
+end
```
@@ -0,0 +1,39 @@
# Deep Convolutional GAN (DC-GAN)

![dcgan_gen_disc](../dcgan_mnist/output/dcgan_generator_discriminator.png)

[Source](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html)

## Model Info

A DC-GAN is a direct extension of the GAN, except that it explicitly uses convolutional and transposed convolutional layers in the discriminator and generator, respectively. The discriminator is made up of strided convolutional layers, batch norm layers, and LeakyReLU activations. The generator is comprised of transposed convolutional layers, batch norm layers, and ReLU activations.
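A schematic Flux version of the two networks for 28×28 MNIST (filter counts and strides here are illustrative assumptions, not taken from `dcgan_mnist.jl`):

```julia
using Flux

# Generator: noise vector -> transposed convs -> 28×28 image.
generator = Chain(
    Dense(100 => 7 * 7 * 256),
    x -> reshape(x, 7, 7, 256, :),
    BatchNorm(256, relu),
    ConvTranspose((4, 4), 256 => 128; stride = 2, pad = 1),      # 7×7 -> 14×14
    BatchNorm(128, relu),
    ConvTranspose((4, 4), 128 => 1, tanh; stride = 2, pad = 1),  # 14×14 -> 28×28
)

# Discriminator: strided convs + LeakyReLU -> real/fake score.
discriminator = Chain(
    Conv((4, 4), 1 => 64, x -> leakyrelu(x, 0.2f0); stride = 2, pad = 1),  # 28×28 -> 14×14
    Conv((4, 4), 64 => 128; stride = 2, pad = 1),                          # 14×14 -> 7×7
    BatchNorm(128, x -> leakyrelu(x, 0.2f0)),
    Flux.flatten,
    Dense(7 * 7 * 128 => 1),
)

z = randn(Float32, 100, 4)        # batch of 4 noise vectors
size(generator(z))                # (28, 28, 1, 4)
size(discriminator(generator(z))) # (1, 4)
```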
## Training

```shell
cd vision/dcgan_mnist
julia --project dcgan_mnist.jl
```

## Results

2000 training steps

> **Review comment:** "steps" should be plural here as well.

![2000 training steps](../dcgan_mnist/output/dcgan_steps_002000.png)

5000 training steps

![5000 training steps](../dcgan_mnist/output/dcgan_steps_005000.png)

8000 training steps

![8000 training steps](../dcgan_mnist/output/dcgan_steps_008000.png)

9380 training steps

![9380 training steps](../dcgan_mnist/output/dcgan_steps_009380.png)

## References

* [Radford, A. et al.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, http://arxiv.org/abs/1511.06434, (2015).](https://arxiv.org/pdf/1511.06434v2.pdf)

* [pytorch.org/tutorials/beginner/dcgan_faces_tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html)
@@ -0,0 +1,26 @@
# Multilayer Perceptron (MLP)

![mlp](../mlp_mnist/docs/mlp.svg)

> **Review comment:** Similar comments about image sourcing and reference formats for this file.

[Source](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)

## Model Info

An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP utilizes a supervised learning technique called backpropagation for training. Its multiple layers and non-linear activations distinguish an MLP from a linear perceptron: it can distinguish data that is not linearly separable.
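A minimal Flux sketch of such a network for flattened 28×28 MNIST digits (the hidden width is an assumption, not copied from `mlp_mnist.jl`):

```julia
using Flux

# Input layer (784 flattened pixels) -> hidden layer -> output layer.
# Every non-input node applies a nonlinear activation (here relu),
# which is what lets the MLP separate non-linearly-separable data.
mlp = Chain(
    Flux.flatten,            # 28×28×1×N images -> 784×N matrix
    Dense(784 => 32, relu),  # hidden layer
    Dense(32 => 10),         # output layer: one score per digit class
)

x = rand(Float32, 28, 28, 1, 4)  # a batch of 4 images
size(mlp(x))                     # (10, 4)
```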
## Training

```shell
cd vision/mlp_mnist
julia --project mlp_mnist.jl
```

## Reference

* [@book{zhang2020dive,
  title={Dive into Deep Learning},
  author={Aston Zhang and Zachary C. Lipton and Mu Li and Alexander J. Smola},
  note={\url{https://d2l.ai}},
  year={2020}
  }](http://d2l.ai/chapter_multilayer-perceptrons/mlp.html)

> **Review comment:** Same comment as above about this just rendering as bibtex.
> **Review comment:** Where does this image come from? Does it require attribution?