
The moving mean and variance are not updated in BatchNorm when using ParallelDo Op. #9386

Closed
qingqing01 opened this issue Mar 26, 2018 · 0 comments

qingqing01 (Contributor) commented Mar 26, 2018

  • Background and problem:
    Several tasks in the models repo (image classification, MobileNet-SSD detection, and OCR recognition) hit the same problem: when training a single-machine multi-GPU model with the ParallelDo Op, the cost on the train set converges, but the evaluation results on the test set are completely wrong.

  • Cause:
    In single-machine multi-GPU training, the moving mean/variance inside BatchNorm, that is, the parameters created at https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/layers/nn.py#L1517, are never updated. The test pass therefore keeps using the moving mean/var from the initial random initialization (or from the loaded pretrained model). A minimal reproduction sketch is given at the end of this issue.

  • How to debug:
    Print the batch_norm_xx.w_1/2 parameters both without ParallelDo and with ParallelDo, and check whether the moving mean/var change between passes. For example, in my MobileNet-SSD task the printing code is as follows (a before/after snapshot variant is sketched right after this snippet):

    import numpy as np
    import paddle.fluid as fluid

    def test(pass_id):
        map_eval.reset(exe)
        test_map = None
        for _, data in enumerate(test_reader()):
            # Fetch the per-batch mAP, the accumulated mAP and the loss.
            m1, test_map, loss_v = exe.run(test_program,
                                           feed=feeder.feed(data),
                                           fetch_list=[map, accum_map, loss])
        print("Test {0}, map {1}, loss {2}".format(pass_id, test_map[0], loss_v[0]))
        # Read the moving mean of one BatchNorm layer out of the global scope
        # and print its first 20 elements; if the bug is present, these values
        # never change from one pass to the next.
        t = fluid.global_scope().find_var('batch_norm_34.w_1').get_tensor()
        t = np.array(t).astype(np.float32).flatten()
        print('batch_norm_34.w_1', t[0:20])
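
To make the comparison more direct, the same scope lookup can be wrapped into a before/after snapshot around a single training iteration. This is a minimal sketch built from the calls in the snippet above; exe, train_program, feeder, and train_data are assumed to exist in the surrounding training script, and batch_norm_34 is the layer name from the MobileNet-SSD model, so it will differ in other networks.

    import numpy as np
    import paddle.fluid as fluid

    def snapshot(var_name):
        # Read the current value of a variable out of the global scope.
        t = fluid.global_scope().find_var(var_name).get_tensor()
        return np.array(t).astype(np.float32).flatten()

    # Moving mean (w_1) and moving variance (w_2) of one BatchNorm layer.
    before_mean = snapshot('batch_norm_34.w_1')
    before_var = snapshot('batch_norm_34.w_2')

    # One training iteration; train_program, feeder and train_data are
    # assumed to be set up by the surrounding training script.
    exe.run(train_program, feed=feeder.feed(train_data))

    after_mean = snapshot('batch_norm_34.w_1')
    after_var = snapshot('batch_norm_34.w_2')

    # With ParallelDo both deltas stay at 0.0, i.e. the moving statistics
    # never move; without ParallelDo they change after every iteration.
    print('moving mean delta:    ', np.abs(after_mean - before_mean).max())
    print('moving variance delta:', np.abs(after_var - before_var).max())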
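
For completeness, below is a rough sketch of the kind of setup that triggers the bug: a tiny network with one batch_norm layer replicated across devices via ParallelDo. It follows the ParallelDo pattern from the Fluid unit tests of that era; it is an illustrative reconstruction, not code from this issue, and the layer name batch_norm_0 as well as exact signatures are assumptions that may differ between Fluid versions.

    import paddle.fluid as fluid

    img = fluid.layers.data(name='img', shape=[3, 32, 32], dtype='float32')
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')

    # Replicate the sub-network across the visible devices.
    places = fluid.layers.get_places(device_count=2)
    pd = fluid.layers.ParallelDo(places)
    with pd.do():
        img_ = pd.read_input(img)
        label_ = pd.read_input(label)
        conv = fluid.layers.conv2d(input=img_, num_filters=16, filter_size=3)
        # The moving mean/variance of this layer are expected to live in
        # batch_norm_0.w_1 / batch_norm_0.w_2 (the name is an assumption).
        bn = fluid.layers.batch_norm(input=conv)
        pred = fluid.layers.fc(input=bn, size=10, act='softmax')
        loss = fluid.layers.cross_entropy(input=pred, label=label_)
        pd.write_output(fluid.layers.mean(x=loss))
    avg_loss = fluid.layers.mean(x=pd())

    fluid.optimizer.SGD(learning_rate=0.01).minimize(avg_loss)

Training this program and applying the snapshot check above to batch_norm_0.w_1/w_2 should show the loss decreasing while the moving statistics stay frozen, which matches the symptom described under Cause.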