DCGAN example doesn't work with different image sizes #70
The error tells you that the number of inputs to the loss function is different from the number of given targets. It happens at line 209. The problem is that the generator and discriminator architectures are apparently fixed to the default image size (see the annotations in the model). Adding a pooling layer at the end of the discriminator, one that squeezes every batch element into a single output, would make it independent of the input size.
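For example, a minimal sketch of that idea using nn.AdaptiveAvgPool2d (the layer sizes and names here are illustrative, not from the example; only the head of a discriminator is shown):

```python
import torch
import torch.nn as nn

ndf, nc = 64, 3  # the example's defaults

# a couple of conv blocks followed by an adaptive pool, so the head
# no longer depends on the input resolution
netD_head = nn.Sequential(
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 2),
    nn.LeakyReLU(0.2, inplace=True),
    nn.AdaptiveAvgPool2d(1),               # (N, ndf*2, H, W) -> (N, ndf*2, 1, 1)
    nn.Conv2d(ndf * 2, 1, 1, bias=False),  # 1x1 conv acts as the classifier
    nn.Sigmoid(),
)

x = torch.randn(8, nc, 96, 96)       # any spatial size now works
print(netD_head(x).view(-1).size())  # torch.Size([8])
``` |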
As @apaszke mentions, the G and D networks are generated with the 64x64 limitation hardcoded. The implementation of the DCGAN here is very similar to the dcgan.torch implementation, and someone else asked about this limitation and got this answer: soumith/dcgan.torch#2 (comment). By following the changes suggested in that comment, you can expand the network to 128x128. So for the generator:

```python
class _netG(nn.Module):
    def __init__(self, ngpu):
        super(_netG, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 16, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 16),
            nn.ReLU(True),
            # state size. (ngf*16) x 4 x 4
            nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 8 x 8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 16 x 16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 32 x 32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 64 x 64
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 128 x 128
        )

    def forward(self, input):
        # simplified single-GPU forward; the example dispatches via
        # nn.parallel.data_parallel when ngpu > 1
        return self.main(input)
```

And for the discriminator:
```python
class _netD(nn.Module):
    def __init__(self, ngpu):
        super(_netD, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 128 x 128
            nn.Conv2d(nc, ndf, 4, stride=2, padding=1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 64 x 64
            nn.Conv2d(ndf, ndf * 2, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 32 x 32
            nn.Conv2d(ndf * 2, ndf * 4, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 16 x 16
            nn.Conv2d(ndf * 4, ndf * 8, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 8 x 8
            nn.Conv2d(ndf * 8, ndf * 16, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 16),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*16) x 4 x 4
            nn.Conv2d(ndf * 16, 1, 4, stride=1, padding=0, bias=False),
            nn.Sigmoid()
            # state size. 1
        )

    def forward(self, input):
        # simplified single-GPU forward; the example dispatches via
        # nn.parallel.data_parallel when ngpu > 1
        return self.main(input).view(-1, 1).squeeze(1)
```

However, as you can also see in that thread, it is harder to get a stable game between the generator and discriminator at this larger size. To avoid this, I think you'll have to take a look at the improvements used in https://github.com/openai/improved-gan (paper: https://arxiv.org/abs/1606.03498). That repository includes a model for 128x128 ImageNet generation. |
Ahh, thank you for the extra information; that helps immensely, as does the intuition that training may be less stable at larger image sizes. So @bartolsthoorn, for the images I'm using (512x512), I should probably look into the improved GAN paper and the associated OpenAI implementation? |
@magsol I would suggest first trying your dataset at the standard 64x64. Next, run it at 128x128, with either the extra convolution layers or the pooling layer listed above. After that you can try 512x512, though I am no expert and I have not seen pictures that large generated by a DCGAN. You could also consider generating 128x128 images and then using a separate super-resolution network to reach 512x512. 64x64 and 128x128 are easy to try (the model includes the preprocessing, i.e. the rescaling of the images) and should be easier to generate. Did you already get good results with your data at the 64x64 scale? Please share your experience so far. 😄
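For reference, a minimal sketch of that built-in preprocessing, assuming a recent torchvision (older versions name Resize as Scale); the path and sizes are placeholders:

```python
import torchvision.datasets as dset
import torchvision.transforms as transforms

image_size = 64  # try 64 first, then 128

dataset = dset.ImageFolder(
    root='path/to/images',  # placeholder path
    transform=transforms.Compose([
        transforms.Resize(image_size),      # rescale the shorter side
        transforms.CenterCrop(image_size),  # crop to a square
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]))
``` |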
@bartolsthoorn I ran dcgan with the following arguments: [arguments not preserved]

I tried changing the [setting elided]. Unfortunately, after very few training iterations it looks like the mode collapsed: the images from the 24th epoch look like pure static [generated samples omitted], while the real images look fine [real samples omitted]. Happy to hear any suggestions you may have :) Thank you so much for your help so far! Learning a lot about GANs! |
Managed to override the default image loader in torchvision so it properly pulls the images in grayscale format, and changed the channel count to match.
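Something along these lines (a sketch; the path is a placeholder, and ImageFolder's loader argument is used to force single-channel images):

```python
from PIL import Image
import torchvision.datasets as dset
import torchvision.transforms as transforms

def grayscale_loader(path):
    # PIL mode 'L' is single-channel grayscale
    return Image.open(path).convert('L')

dataset = dset.ImageFolder(
    root='path/to/images',  # placeholder path
    loader=grayscale_loader,
    transform=transforms.Compose([
        transforms.Resize(64),
        transforms.CenterCrop(64),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),  # single-channel mean/std
    ]))
# the networks must then be built with nc = 1
``` |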
Yes, the learning is unstable. There are some new interesting suggestions in the dcgan.torch thread: soumith/dcgan.torch#2 (comment)
|
Any luck with the above tricks/heuristics? |
Ha! Isn't all of practical ML just tricks :) Haven't had a chance yet, behind on my teaching. Hoping to get back to this within the next week though, and will absolutely update when I do.
|
@magsol If you happen to run into difficulties training on the 512x512 images, you could always scale the images down first, to say 64x64, and see if that even gets you the results you want, then scale up later? |
Seems like I have one solution to this problem. After the discriminator, the output size changes with the input size, but when the loss compares the input and the target, your label size does not change. So you can either let the label size change with your input size, or make the discriminator output a fixed size no matter what size of data you feed it. FYI: if your x_A has size [batch_size, 3, 64, 64], then after D the feature size will be [batch_size, 1] and the real_label size [batch_size].
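A sketch of the first option, with stand-in tensors (sizing the target from the discriminator's actual output rather than a hard-coded batch size):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()
real_label = 1.0

# stand-in for netD(input).view(-1): whatever length the discriminator
# actually produces for this input size
output = torch.rand(16)

# size the target from the output instead of hard-coding batch_size,
# so the loss never sees a shape mismatch
label = torch.full_like(output, real_label)
loss = criterion(output, label)
print(loss.item())
``` |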
Is there any way to implement a DCGAN that generates rectangular images (e.g. 128x32) in PyTorch? Nearly every example I've seen works with square images. |
Yes, you can do that @xjdeng; you simply have to ensure that the output of your first (dense or projection) layer has the aspect ratio you want in the final output. That is one way. So in your case, the dense layer output should be (batch_size, channels, 4, 1) or some multiple of that. If your network then consists of transposed convolutions that double the size at each layer, you would need 5 transposed conv layers to get images of size (batch_size, channels, 128, 32). See the sketch below.
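A sketch of that layout, using the example's conv-transpose projection in place of a dense layer (the nz, ngf, nc values are assumed defaults):

```python
import torch
import torch.nn as nn

nz, ngf, nc = 100, 64, 3  # noise dim, feature maps, output channels (assumed)

netG = nn.Sequential(
    # project z to a 4x1 spatial map to lock in the 4:1 aspect ratio
    nn.ConvTranspose2d(nz, ngf * 16, kernel_size=(4, 1), stride=1, padding=0, bias=False),
    nn.BatchNorm2d(ngf * 16),
    nn.ReLU(True),
    # each block doubles both dims: 4x1 -> 8x2 -> 16x4 -> 32x8 -> 64x16 -> 128x32
    nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf * 8),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf * 4),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf * 2),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
    nn.Tanh(),
)

z = torch.randn(8, nz, 1, 1)
print(netG(z).size())  # torch.Size([8, 3, 128, 32])
``` |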
Not sure if this thread is still active, but did anyone try to generate 128x128 images and upscale to 512x512, per @bartolsthoorn's suggestion? |
DCGAN is quite old. Check the latest papers on GANs and you will find many large-resolution models/examples. You need a dataset with high-resolution images as well (of course).
|
@bartolsthoorn thank you for the reply. I'm pretty new to GAN training; if I have downloaded art images from WikiArt and they have different sizes, do I have to somehow preprocess all of them to the same size (e.g. 512x512)? What about rectangular images? |
I was able to get dcgan operating successfully at 128x128 by adding the convolutional layers described above and then running with ngf 128 and ndf 32. When I attempted to go to 512 I was not able to get a stable result. I'm attempting to add white noise to the discriminator to see if that helps. ** I ended up abandoning dcgan and have moved over to bmsggan, which is a variation on progressive GANs. It handles higher resolutions much better. ** |
Hi, I am also trying to implement DCGAN for grayscale images using PyTorch, but I got an error saying 'RuntimeError: Given groups=1, weight of size 64 1 4 4, expected input[128, 3, 64, 64] to have 1 channels, but got 3 channels instead'. I already set the number of channels to 1 but still got the error. Do you happen to know where I can fix the problem? |
@nalinzie If you share the code here, then it may be possible to help. |
https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html |
@nalinzie Make sure to check that what your dataloader is outputting is also a single-channel image. (See line 60 in b9f3b2e.)
You can also do a print(X.size()) right before you put anything into and out of your model, to check what dimensions your tensors actually have.
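For example (with a stand-in dataset and model, to keep the snippet runnable; swap in your real dataloader and netD):

```python
import torch
import torch.nn as nn
import torch.utils.data as data

# dummy stand-ins for the real dataset and discriminator
dataset = data.TensorDataset(torch.randn(256, 1, 64, 64), torch.zeros(256))
dataloader = data.DataLoader(dataset, batch_size=128)
netD = nn.Sequential(nn.Conv2d(1, 8, 4, 2, 1), nn.LeakyReLU(0.2))

for X, _ in dataloader:
    print(X.size())        # expect torch.Size([128, 1, 64, 64]); dim 1 is channels
    print(netD(X).size())  # check what the model actually produces
    break
``` |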
Hello, I am also working with the example code and trying to get it to work with smaller 16x16 images, but it doesn't work at those dimensions. How do I need to change the generator and discriminator code for DCGAN to work with 16x16 images? |
When I use the D and G code given above for 128 I am getting the following error: [error output not preserved] |
Changing the kernel size to 2 and the stride to 4 in the last Conv2d of the discriminator seems to fix that error, but I just want to make sure I'm not crazy here.
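One way to sanity-check this sort of change is the standard convolution output-size formula (generic conv arithmetic, not code from the example); the final conv must see a 4x4 map to emit 1x1 with kernel 4, stride 1:

```python
def conv2d_out(size, kernel, stride, padding=0):
    # standard convolution arithmetic: floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

# spatial sizes through the 128x128 discriminator above
n = 128
for k, s, p in [(4, 2, 1)] * 5 + [(4, 1, 0)]:
    n = conv2d_out(n, k, s, p)
    print(n)  # 64, 32, 16, 8, 4, 1
``` |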
Is it working now, EvanZ? |
Could you name some GANs that could be made to work easily with any input size? |
Hello, have you found one? Can you share it? |
No @chang7ing, I couldn't find one. If you find one, please share it here. |
Hi Michael Powers, have you tried 256x256 for the generated images, and does it work well? |
@bartolsthoorn
Do you have any idea about that? |
I'm trying to use this code as a starting point for building GANs from my own image data: 512x512 grayscale images. If I change any of the default arguments (e.g. --imageSize 512) I get the following error: [error output not preserved]

Still learning my way around PyTorch, so the network architectures that are spit out before the above message don't yet give me much intuition. I appreciate any pointers you can give!