When the network reaches convergence, the loss stays around 2.0 #47

Open
Hukongtao opened this issue Oct 11, 2018 · 14 comments

@Hukongtao

I run

python train.py --is-training --update-mean-var --train-beta-gamma

to train the network, but when the network reaches convergence, the loss stays around 2.0.
How did you get 0.2?
@hellochick Thank you very much for your reply.

@Hukongtao
Author

And the mIoU is 0.323895.

@hellochick
Owner

Because you set the flag --update-mean-var, you are updating the moving mean and moving variance of the batch normalization layers during training. Those statistics need a large batch size to be updated well.
Can you tell me how large your batch size is?
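
For context, here is a minimal TF 1.x sketch of what a flag like --update-mean-var typically gates; this is an assumption based on the discussion above, not necessarily this repo's exact code. When the flag is off, the batch-norm moving-average update ops are simply never attached to the train op, so the pretrained statistics stay frozen:

import tensorflow as tf  # TF 1.x, as used by this repo

update_mean_var = False  # with a batch size of 1-2, leave the moving statistics frozen

# Tiny stand-in graph; the real model is PSPNet with many BN layers.
x = tf.placeholder(tf.float32, [None, 32, 32, 3])
y = tf.placeholder(tf.int32, [None])
net = tf.layers.conv2d(x, 8, 3, padding='same')
net = tf.layers.batch_normalization(net, training=update_mean_var)
logits = tf.layers.dense(tf.reduce_mean(net, axis=[1, 2]), 19)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

optimizer = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9)
if update_mean_var:
    # BN moving-average updates live in the UPDATE_OPS collection; run them
    # together with the gradient step. They only stay stable with large batches.
    with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
        train_op = optimizer.minimize(loss)
else:
    # Small batches: keep the pretrained moving mean/variance untouched.
    train_op = optimizer.minimize(loss)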

@Hukongtao
Author

@hellochick I get it. My batch size is 2 because my GPU memory is limited, so I will try again. Thanks a lot.

@Hukongtao
Author

@hellochick So should I run python train.py --is-training or python train.py --is-training --train-beta-gamma?
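
For reference, a hedged guess at what --train-beta-gamma controls, going only by the flag name: whether the batch-norm offset/scale variables (beta/gamma) are included in the set of variables the optimizer updates. A minimal TF 1.x sketch, not this repo's exact code:

import tensorflow as tf  # TF 1.x

train_beta_gamma = False  # when off, BN beta/gamma keep their pretrained values

# Stand-in layer so there are some beta/gamma variables to filter.
x = tf.placeholder(tf.float32, [None, 8])
tf.layers.batch_normalization(x)

trainable = [
    v for v in tf.trainable_variables()
    if train_beta_gamma or ('beta' not in v.name and 'gamma' not in v.name)
]
# The optimizer would then be given var_list=trainable instead of every variable.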

@ningscapr

Hi, I have the same situation: the loss is about 2.0. Did you use the pretrained model? @Hukongtao

@Hukongtao
Author

@ningscapr I didn't use the pretrained model. When I use the pretrained model, the initial loss is 0.2, but by the time training is done, the loss is about 2.0. I don't know why.

@waterputty

@Hukongtao Hi, have you solved your issue? I followed hellochick's advice and removed --update-mean-var, but the loss was still around 2.0 after 90k iterations. I use the Cityscapes dataset and my batch size is 2.

@Hukongtao
Author

@waterputty No, I haven't.

@waterputty

@Hukongtao My latest finding is that the loss went down to 1.7 after 150k iterations. I think Cityscapes may need many more iterations than the 60k in the code, or the learning rate must be set much higher than the 1e-3 that also comes from the original code. But I'm not sure. @hellochick Could you give more advice? Thanks a lot.
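
One thing worth checking here: train.py in this family of repos usually applies a 'poly' learning-rate decay, lr = base_lr * (1 - step / max_steps) ** power. That matches the LEARNING_RATE, POWER and NUM_STEPS defaults quoted later in this thread, though the schedule itself is an assumption on my part, not a quote from the code. Under that schedule the rate shrinks toward zero as training approaches NUM_STEPS, so extending the run without also raising max_steps (or base_lr) mostly adds near-zero-learning-rate steps:

# Hedged sketch of the assumed poly schedule with the defaults from this thread.
BASE_LR, POWER, MAX_STEPS = 1e-3, 0.9, 60001

def poly_lr(step):
    return BASE_LR * (1.0 - float(step) / MAX_STEPS) ** POWER

for step in (0, 15000, 30000, 45000, 59000):
    print(step, poly_lr(step))
# 0      1.0e-3
# 15000  ~7.7e-4
# 30000  ~5.4e-4
# 45000  ~2.9e-4
# 59000  ~2.5e-5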

@Hukongtao
Author

@waterputty Have you solved your issue? I am still confused.

@DenceChen

@hellochick Can you show your training args? Your README does not have a training introduction.

@KodamaXYZ

@DenceChen @waterputty I am also having the same problem. My training args are:
import numpy as np

# BGR channel means subtracted from the input images
IMG_MEAN = np.array((103.939, 116.779, 123.68), dtype=np.float32)

BATCH_SIZE = 1
DATA_DIRECTORY = './data/cityscapes_dataset/cityscape/'
DATA_LIST_PATH = './list/cityscapes_train_list.txt'
IGNORE_LABEL = 255        # label value excluded from the loss
INPUT_SIZE = '713,713'    # training crop size
LEARNING_RATE = 1e-3
MOMENTUM = 0.9
NUM_CLASSES = 19
NUM_STEPS = 60001
POWER = 0.9               # exponent of the poly learning-rate decay
RANDOM_SEED = 1234
WEIGHT_DECAY = 0.0001
RESTORE_FROM = './'
SNAPSHOT_DIR = './model/'
SAVE_NUM_IMAGES = 4
SAVE_PRED_EVERY = 20

I can't make the batch size any larger because it won't run even on a 1080 Ti GPU with 11 GB of memory.
After 60k steps I got a loss of 2.0.
I even tried 150k steps and it was still around 1.9.
Let me know if you guys succeed.

@AmeetR

AmeetR commented Jun 3, 2019

I'm having the same issue. The best I'm able to get is a loss of 1.7 and ~0.3 mIoU. @hellochick I want to use this as a baseline, so I'd like to get the best results possible.

@zhulf0804

@AmeetR I think it's hard to reproduce this repo. I ran it a month ago, debugged it, and failed.
Then I implemented PSPNet myself and achieved 0.74 mIoU.
Maybe you can try to reproduce PSPNet yourself.
Good luck.
