No log INFO shown on the terminal when training? #333

Closed
yulong314 opened this issue Dec 4, 2020 · 13 comments
Labels: question (Further information is requested)

@yulong314 commented Dec 4, 2020

I used to see training log info on screen during training, such as "- mmdet - INFO - Epoch [1][50/116] lr: 1.978e-03, eta: 0:34:52, time: 0.765, ....". But I see no such info when using mmpose. Is that normal?

@innerlee (Contributor) commented Dec 4, 2020

No.

@innerlee (Contributor) commented Dec 4, 2020

Could you share your config and command line?

@yulong314 (Author) commented Dec 4, 2020

My command line: python tools/train.py configs/top_down/darkpose/coco-wholebody/hrnet_w48_coco_wholebody_384x288_dark_plus_12R.py

My config file is as follows:

log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=30)
evaluation = dict(interval=10, metric='mAP', key_indicator='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    # warmup=None,
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[150, 200])
total_epochs = 250
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=32,
    dataset_joints=32,
    dataset_channel=[
        list(range(32)),
    ],
    inference_channel=list(range(32)))

# model settings
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/top_down/'
    'hrnet/hrnet_w48_coco_384x288_dark-741844ba_20200812.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(48, 96)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(48, 96, 192)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(48, 96, 192, 384))),
    ),
    keypoint_head=dict(
        type='TopDownSimpleHead',
        in_channels=48,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
    ),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process=True,
        shift_heatmap=True,
        unbiased_decoding=True,
        modulate_kernel=11),
    loss_pose=dict(type='JointsMSELoss', use_target_weight=True))

data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    use_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    bbox_thr=1.0,
    use_gt_bbox=True,
    image_thr=0.0,
    bbox_file=None,
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    # dict(type='TopDownRandomFlip', flip_prob=0.5),
    # dict(type='TopDownRandomFlipH', flip_prob=0.5),    
    # dict(
    #     type='TopDownHalfBodyTransform',
    #     num_joints_half_body=8,
    #     prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=3, unbiased_encoding=True),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = '/mnt/data/wholebody'
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=1,
    train=dict(
        type='TopDownCocoWeijingDataset',
        ann_file=f'{data_root}/12down_train.json',
        img_prefix=f'{data_root}/train_v3_ok_img/',
        data_cfg=data_cfg,
        pipeline=train_pipeline),
    val=dict(
        type='TopDownCocoWeijingDataset',
        ann_file=f'{data_root}/12down_train.json',
        img_prefix=f'{data_root}/train_v3_ok_img/',
        data_cfg=data_cfg,
        pipeline=train_pipeline),
    test=dict(
        type='TopDownCocoWeijingDataset',
        ann_file=f'{data_root}/12down_train.json',
        img_prefix=f'{data_root}/train_v3_ok_img/',
        data_cfg=data_cfg,
        pipeline=train_pipeline),
)

# load_from = '/home/sy/working/otherCodes/mmpose/work-dirs/flip2500/latest.pth'

@innerlee (Contributor) commented Dec 4, 2020

What's the size of your training set?

@yulong314 (Author)

There is evaluation info, such as:

[INFO ] text:_log_info:122 - Epoch(val) [20][1] AP: 0.1325, AP .5: 0.5574, AP .75: 0.0000, AP (M): -1.0000, AP (L): 0.1325, AR: 0.1900, AR .5: 0.7000, AR .75: 0.0000, AR (M): -1.0000, AR (L): 0.1900
2020-12-04 16:30:39,364 - mmpose - INFO - Epoch(val) [20][1] AP: 0.1325, AP .5: 0.5574, AP .75: 0.0000, AP (M): -1.0000, AP (L): 0.1325, AR: 0.1900, AR .5: 0.7000, AR .75: 0.0000, AR (M): -1.0000, AR (L): 0.1900

but no info about the losses.

@yulong314 (Author)

> What's the size of your training set?

Only 30 KB.

@innerlee (Contributor) commented Dec 4, 2020

What's the image count of the training set? Is it fewer than 400 images?

@yulong314 (Author)

> What's the image count of the training set? Is it fewer than 400 images?

Yes, only 38 images.

@yulong314 (Author) commented Dec 4, 2020

I just tried training on the official COCO dataset, and the loss info shows up.
With my tiny dataset, there is NO loss info.

@innerlee (Contributor) commented Dec 4, 2020

The log interval is

interval=50,

but your training set has only 38 images, which at samples_per_gpu=8 gives about 5 iterations per epoch, so it will never trigger the logging.

Loss is not printed during evaluation currently. You may raise a feature request here: #9
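
As a rough sketch of the arithmetic (assuming the default epoch-based TextLoggerHook, which emits a log line every `interval` training iterations within an epoch, and the numbers from the config and thread above):

import math

# Numbers taken from the thread and the config above.
num_images = 38        # size of the tiny training set
samples_per_gpu = 8    # batch size per GPU
log_interval = 50      # log_config interval

# An epoch-based TextLoggerHook needs at least `log_interval`
# iterations inside one epoch before it prints anything.
iters_per_epoch = math.ceil(num_images / samples_per_gpu)

print('iterations per epoch:', iters_per_epoch)               # 5
print('logger ever fires:', iters_per_epoch >= log_interval)  # False -> no training log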

@yulong314 (Author)

> The log interval is interval=50, so with only about 5 iterations per epoch it will never trigger the logging.
>
> Loss is not printed during evaluation currently. You may raise a feature request here: #9

Thanks. By changing interval=1, I now see the loss info.
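
For reference, the only change needed is the interval in the log_config block of the config above; a minimal sketch (interval=1 logs every iteration, and any value no larger than the ~5 iterations per epoch would also work):

log_config = dict(
    interval=1,  # was 50; must not exceed the iterations per epoch, or nothing is logged
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])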

@innerlee (Contributor) commented Dec 4, 2020

Printing logs for tiny datasets is already supported by adjusting the config, as above.
"Printing loss during evaluation" is the remaining issue.

@innerlee added the question (Further information is requested) label on Dec 4, 2020
@jin-s13 (Collaborator) commented Dec 16, 2020

"Printing loss in evaluation" is listed in our TODO list. I will close this issue for now.

@jin-s13 closed this as completed on Dec 16, 2020
rollingman1 pushed a commit to rollingman1/mmpose that referenced this issue Nov 5, 2021