Question: monet2photo training loss #30
Comments
I'm not sure how representative this is, but here's my final loss for the discriminators and generators. Once the learning rate started to be lowered at epoch 100, the G_B loss slowly increased and G_A seemed to converge in the 0.35 to 0.40 range. Both discriminators stopped oscillating and gradually got lower. Perhaps this will be useful to someone doing the same. I used instance norm; perhaps I should have used batch norm since I was running batch_size = 8.
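For context, the instance-vs-batch norm question mostly comes down to which normalization layer the generator and discriminator blocks are built with. A minimal PyTorch sketch (not the repo's code; make_norm is a hypothetical helper):

```python
import torch
import torch.nn as nn

def make_norm(kind: str, channels: int) -> nn.Module:
    """Hypothetical helper: pick the normalization layer for a conv block."""
    if kind == "instance":
        # Per-sample statistics: behaves the same whether batch_size is 1 or 8.
        return nn.InstanceNorm2d(channels)
    if kind == "batch":
        # Statistics pooled across the batch: only meaningful for batch_size > 1.
        return nn.BatchNorm2d(channels)
    raise ValueError(f"unknown norm type: {kind}")

x = torch.randn(8, 64, 32, 32)              # batch_size = 8, 64 feature maps
print(make_norm("instance", 64)(x).shape)   # torch.Size([8, 64, 32, 32])
print(make_norm("batch", 64)(x).shape)      # torch.Size([8, 64, 32, 32])
```

With instance norm, a larger batch only changes how gradients are averaged; with batch norm it also changes the statistics each sample is normalized with, which is why the choice matters more once batch_size > 1.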
The losses are not very interpretable, since G and D are optimizing a minimax game. The plots you posted here look quite typical to me (except for the spike). I would mainly focus on the quality of the images.
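To make the minimax point concrete, here is an illustrative sketch of the two opposing objectives (CycleGAN uses a least-squares GAN loss; D, real, and fake below are placeholders, not the repo's code):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def d_loss(D, real, fake):
    pred_real = D(real)
    pred_fake = D(fake.detach())   # don't backprop into G during D's update
    # D is pushed toward scoring real images as 1 and generated images as 0.
    return 0.5 * (mse(pred_real, torch.ones_like(pred_real)) +
                  mse(pred_fake, torch.zeros_like(pred_fake)))

def g_loss(D, fake):
    pred_fake = D(fake)
    # G is pushed toward getting those same generated images scored as 1,
    # directly opposing D, so neither loss is expected to fall to zero.
    return mse(pred_fake, torch.ones_like(pred_fake))
```

Because the two objectives pull against each other, flat or oscillating loss curves can still correspond to steadily improving images.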
@filmo: Have you found the reason for the spike in the loss, and how to solve it?
No, I didn't end up exploring it further. |
Thanks. I'll follow this issue to see how it gets solved.
I'm trying to train on the monet2photo dataset. My command line was:
python train.py --dataroot ./datasets/monet2photo --name monet2photo --model cycle_gan --gpu_ids 0,1 --batchSize 8 --identity 0.5
The paper discussed using a batch size of 1, but I increased it to 8 to more fully occupy the GPUs. I think this is the only difference between what was described in the paper and my settings, but I may be wrong.
I'm training on two GTX 1070s.
I'm about 80 epochs in (~40 hours on my setup), and it seems like I'm oscillating between generated 'photos' that look okay-ish and 'photos' that look pretty 'meh', more like the original painting.
My loss declined pretty rapidly for the first 20 or so epochs, but now seems to be relatively stable with occasional crazy spikes:
I think it's improving slightly with each epoch based on the images, and there seems to be a slight downward trend in the loss, but I also might just be kidding myself because I've been staring at it for a while. In other words, I'm not certain that what it's generating at epoch 80 is really that much better than at epoch 30. Here's the most recent detailed loss curve.
Question: Is this expected behavior (more or less), or should I be concerned that I've plateaued and/or used the wrong settings? At 100 epochs the learning rate is set to start decreasing, based on the default settings. Given that it's taking about 30 minutes per epoch, and thus about 61 more hours to complete 200 epochs, I'm wondering if I should "keep on going" or "abort" and fix some settings.
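For reference, the default schedule described in the paper keeps the learning rate constant for the first 100 epochs and then decays it linearly toward zero over the next 100. A minimal PyTorch sketch of that schedule (variable names here are illustrative, not the repo's actual options):

```python
import torch

n_epochs, n_epochs_decay, base_lr = 100, 100, 0.0002

def lr_lambda(epoch):
    # Multiplicative factor: 1.0 for the first 100 epochs, then linear decay.
    return 1.0 - max(0, epoch - n_epochs) / float(n_epochs_decay)

params = [torch.zeros(1, requires_grad=True)]          # placeholder parameters
optimizer = torch.optim.Adam(params, lr=base_lr, betas=(0.5, 0.999))
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(n_epochs + n_epochs_decay):
    # ... run one training epoch, calling optimizer.step() per batch ...
    scheduler.step()
```

So any plateau before epoch 100 is happening at the full learning rate; the decay phase afterwards usually smooths the curves.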