705062791/PGBIG

Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction

Official implementation of Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction (CVPR 2022 paper)

[PDF] [Supp] [Demo]

Authors

  1. Tiezheng Ma, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  3. Chengjiang Long, Meta Reality Labs, USA, [email protected]
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China, [email protected]
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China, [email protected]

Abstract

    This paper presents a high-quality human motion prediction method that accurately predicts future human poses given observed ones. Our method is mainly based on the observation that a good initial guess of the future pose sequence, such as the mean of future poses, is very helpful to improve the forecasting accuracy. This motivates us to design a novel two-stage prediction strategy, including an init-prediction network that just computes a good initial guess and a formal-prediction network that takes both the historical and initial poses to predict the target pose sequence. We extend this idea further and design a multi-stage prediction framework with each stage predicting initial guess for the next stage, which rewards us with significant performance gain. To fulfill the prediction task at each stage, we propose a network comprising Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). Sequentially executing the two networks can extract spatiotemporal features over the global receptive field of the whole pose sequence effectively. All the above design choices cooperating together make our method outperform previous approaches by a large margin (6%-7% on Human3.6M, 5%-10% on CMU-MoCap, 13%-16% on 3DPW).
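The abstract names the mean of future poses as one example of a good initial guess. The following is a minimal sketch of that single idea, not the authors' implementation: it assumes each pose is a flat list of coordinates, and the helper names `mean_pose` and `initial_guess_from_mean` are hypothetical. The ground-truth future is only available during training, so a quantity like this could serve as a target for an init-prediction stage; how the repository builds its intermediate targets is not described here.

```python
def mean_pose(future_poses):
    """Average a sequence of poses over time.

    Each pose is assumed to be a flat list of coordinates (hypothetical layout).
    """
    n_frames = len(future_poses)
    dim = len(future_poses[0])
    return [sum(frame[d] for frame in future_poses) / n_frames for d in range(dim)]


def initial_guess_from_mean(future_poses):
    """Repeat the mean pose once per future frame, as one possible initial guess."""
    mean = mean_pose(future_poses)
    return [list(mean) for _ in future_poses]


# Toy example: three future frames with two coordinates each.
future = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
guess = initial_guess_from_mean(future)
```

In this toy example every frame of `guess` equals the per-coordinate average of the three input frames.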

Overview

PGBIG

Dependencies

  • PyTorch 1.8.0+cu11
  • Python 3.7
  • NVIDIA RTX 2060

Datasets

Human3.6M in exponential-map format can be downloaded from here.

CMU-MoCap was obtained from the repository of the ConvSeq2Seq paper.

3DPW is available from its official website.

Train

  • Train on Human3.6M:

python main_h36m.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 66 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

  • Train on CMU-MoCap:

python main_cmu_3d.py --data_dir [dataset path] --kernel_size 10 --dct_n 35 --input_n 10 --output_n 25 --skip_rate 1 --batch_size 16 --test_batch_size 32 --in_features 75 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

  • Train on 3DPW:

--data_dir [dataset path] --kernel_size 10 --dct_n 40 --input_n 10 --output_n 30 --skip_rate 1 --batch_size 32 --test_batch_size 32 --in_features 69 --cuda_idx cuda:0 --d_model 16 --lr_now 0.005 --epoch 50 --test_sample_num -1

Note:

  • kernel_size: the length of the input sequence that is used.

  • d_model: the latent code dimension of each joint.

  • test_sample_num: the number of samples drawn for the test set; it can be set to 8, 256, or -1 (all). For example, if it is set to 8, then 8 samples are drawn for each action to form the test set.
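The behaviour described for test_sample_num can be illustrated with a short sketch. This is not the repository's sampling code: the function name `select_test_samples`, the dictionary layout, and the fixed seed are assumptions made for illustration only.

```python
import random


def select_test_samples(samples_by_action, test_sample_num, seed=0):
    """Pick test samples per action.

    test_sample_num == -1 keeps every sample; otherwise that many samples
    are drawn (without replacement) for each action.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    selected = {}
    for action, samples in samples_by_action.items():
        if test_sample_num == -1 or test_sample_num >= len(samples):
            selected[action] = list(samples)
        else:
            selected[action] = rng.sample(samples, test_sample_num)
    return selected


# Toy data: integer IDs stand in for test sequences.
demo = {"walking": list(range(100)), "eating": list(range(50))}
subset = select_test_samples(demo, 8)    # 8 samples per action
full = select_test_samples(demo, -1)     # all samples
```

With 8, each action contributes exactly 8 samples; with -1, every available sample is kept.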

After training, the checkpoint is saved in ./checkpoint/.

Test

Append --is_eval to any of the training commands above.

The test result will be saved in ./checkpoint/.
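Because the evaluation command is the training command plus --is_eval, both can be assembled from one set of arguments. The helper below is a convenience sketch and not part of the repository: `build_command` and `H36M_ARGS` are hypothetical names, `/path/to/h36m` is a placeholder, and the flag values are copied from the Human3.6M training command above.

```python
import shlex

# Flag values copied from the Human3.6M training command in this README.
H36M_ARGS = {
    "kernel_size": 10, "dct_n": 35, "input_n": 10, "output_n": 25,
    "skip_rate": 1, "batch_size": 16, "test_batch_size": 32,
    "in_features": 66, "cuda_idx": "cuda:0", "d_model": 16,
    "lr_now": 0.005, "epoch": 50, "test_sample_num": -1,
}


def build_command(script, data_dir, args, is_eval=False):
    """Return the command as a list of tokens; add --is_eval for testing."""
    cmd = ["python", script, "--data_dir", data_dir]
    for name, value in args.items():
        cmd += ["--" + name, str(value)]
    if is_eval:
        cmd.append("--is_eval")
    return cmd


train_cmd = build_command("main_h36m.py", "/path/to/h36m", H36M_ARGS)
eval_cmd = build_command("main_h36m.py", "/path/to/h36m", H36M_ARGS, is_eval=True)

# Print a shell-ready string (quoting each token keeps odd paths safe).
print(" ".join(shlex.quote(token) for token in eval_cmd))
```

The token list can be passed directly to `subprocess.run` if launching from Python is preferred over copying the printed string.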

Citation

If you find our work helpful, please cite our paper.

Ma T, Nie Y, Long C, et al. Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 6437-6446.

Acknowledgments

Our code is based on HisRep and LearnTrajDep.

Licence

MIT
