This is the official implementation for the paper, "Efficient Virtual View Selection for 3D Hand Pose Estimation", AAAI 2022.
We have uploaded prediction results in pixel coordinates (i.e., UVD format) for the NYU and ICVL datasets: https://github.com/iscas3dv/handpose-virtualview/tree/main/result_nyu_icvl. The evaluation code (https://github.com/xinghaochen/awesome-hand-pose-estimation/tree/master/evaluation) can be used to compare performance among SoTA methods.
The models were damaged while being uploaded to Google Drive. We have uploaded new models.
Modified the training method of view selection with the "student" confidence network.
CUDA 11.1
Other versions of CUDA should also work, but please make sure that the version of CUDA used by PyTorch is the same as the one installed on the system, because our code needs to be compiled with nvcc.
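The version requirement above can be checked before compiling. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes that `nvcc --version` prints a line containing `release X.Y` and that PyTorch reports its CUDA version through `torch.version.cuda`.

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_cuda: Optional[str]) -> bool:
    """Return True only when both versions are known and share major.minor."""
    if not torch_cuda or not nvcc_cuda:
        return False
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]


def check_cuda_versions() -> bool:
    """Compare the CUDA version PyTorch was built with against the system nvcc."""
    import torch  # imported lazily so the helpers above work without PyTorch

    output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    system_cuda = parse_nvcc_release(output)
    print("PyTorch CUDA:", torch.version.cuda, "| nvcc CUDA:", system_cuda)
    return cuda_versions_match(torch.version.cuda, system_cuda)
```

Calling `check_cuda_versions()` in the environment used for training prints both versions and returns `False` when they differ, in which case compiling the rendering code is likely to fail.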
- Clone this repository.
- Install the required packages:
pip install -r requirements.txt
- Compile and install the multi-view rendering code:
cd ops/cuda/
python setup.py install
We publish training and evaluation code on NYU hand pose dataset and ICVL hand posture dataset. The data preparation process of these two datasets is as follows.
- Download the NYU Hand Pose Dataset and put it under the data directory:
-directory/
  -test/
    -joint_data.mat
    ...
  -train/
    -joint_data.mat
    ...
- Modify the path field of config/dataset/nyu.json to point to the data directory.
- Download the ICVL Hand Posture Dataset and put it under the data directory:
-directory/
  -Testing/
    -Depth/
      ...
    -test_seq_1.txt
    -test_seq_2.txt
  -Training/
    -Depth/
      ...
    -labels.txt
- Modify the path field of config/dataset/icvl.json to point to the data directory.
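The data preparation steps above can be scripted. This is a rough sketch under stated assumptions rather than code from the repository: the function names are hypothetical, `path` is assumed to be a top-level key of the dataset JSON file, and the ICVL label files are assumed to sit directly under `Testing/` and `Training/`. Entries elided by `...` in the layouts are not checked.

```python
import json
from pathlib import Path
from typing import List

# Files named explicitly in the layouts above (assumed nesting for ICVL).
REQUIRED_FILES = {
    "nyu": ["test/joint_data.mat", "train/joint_data.mat"],
    "icvl": [
        "Testing/test_seq_1.txt",
        "Testing/test_seq_2.txt",
        "Training/labels.txt",
    ],
}


def missing_entries(data_dir: str, dataset: str) -> List[str]:
    """Return the required relative paths that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in REQUIRED_FILES[dataset] if not (root / rel).exists()]


def set_dataset_path(config_file: str, data_dir: str) -> None:
    """Set the "path" field of a dataset JSON config, keeping all other fields."""
    config_path = Path(config_file)
    config = json.loads(config_path.read_text())
    config["path"] = data_dir  # assumption: "path" is a top-level key
    config_path.write_text(json.dumps(config, indent=4) + "\n")
```

For example, `missing_entries("/data/nyu", "nyu")` followed by `set_dataset_path("config/dataset/nyu.json", "/data/nyu")` checks the layout and then updates the config; `/data/nyu` is a placeholder for your own data directory.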
We have already trained some models that you can download and evaluate.
After downloading the models, extract them to the checkpoint folder in the project directory.
In the output results, error_3d_conf shows the average joint error for fusion with confidence, and error_3d_fused shows the average joint error for fusion without confidence.
- Uniformly sampling 25 views:
python train_a2j.py --config config/nyu/eval_uniform25.yaml
- Uniformly sampling 15 views:
python train_a2j.py --config config/nyu/eval_uniform15.yaml
- Uniformly sampling 9 views:
python train_a2j.py --config config/nyu/eval_uniform9.yaml
- Uniformly sampling 3 views:
python train_a2j.py --config config/nyu/eval_uniform3.yaml
- Uniformly sampling 1 view:
python train_a2j.py --config config/nyu/eval_uniform1.yaml
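To make the two reported metrics concrete, here is an illustrative sketch of fusing per-view joint predictions with and without confidence and then computing the average joint error. It is not the repository's implementation: it assumes one scalar confidence per view, normalised to sum to one, and all function names are hypothetical.

```python
import math
from typing import List, Sequence

Joint = Sequence[float]  # one (x, y, z) joint position


def fuse_mean(views: List[List[Joint]]) -> List[List[float]]:
    """Fusion without confidence: plain average over views, per joint."""
    n_views = len(views)
    n_joints = len(views[0])
    return [
        [sum(view[j][c] for view in views) / n_views for c in range(3)]
        for j in range(n_joints)
    ]


def fuse_weighted(
    views: List[List[Joint]], confidences: Sequence[float]
) -> List[List[float]]:
    """Fusion with confidence: average weighted by normalised view confidences."""
    total = sum(confidences)
    weights = [c / total for c in confidences]
    n_joints = len(views[0])
    return [
        [sum(w * view[j][c] for w, view in zip(weights, views)) for c in range(3)]
        for j in range(n_joints)
    ]


def mean_joint_error(pred: List[Joint], gt: List[Joint]) -> float:
    """Average Euclidean distance between predicted and ground-truth joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)
```

With two views predicting a single joint at x = 0 and x = 2 and the ground truth at x = 0, the plain average gives an error of 1.0, while confidences of 3 and 1 pull the fused joint to x = 0.5 and halve the error.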
In the output results, error_3d_fused shows the average joint error.
- Select 15 views from 25 views:
python train_a2j.py --config config/nyu/eval_25select15.yaml
- Select 9 views from 25 views:
python train_a2j.py --config config/nyu/eval_25select9.yaml
- Select 3 views from 25 views:
python train_a2j.py --config config/nyu/eval_25select3.yaml
- Select 1 view from 25 views:
python train_a2j.py --config config/nyu/eval_25select1.yaml
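The README does not spell out how the views are chosen. One plausible reading of "select k views from 25 views" is to keep the k views with the highest confidence scores; the sketch below illustrates only that assumption, with a hypothetical function name, and may differ from what the evaluation scripts actually do.

```python
from typing import List, Sequence


def select_top_k_views(confidences: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-confidence views, in ascending order."""
    if not 0 < k <= len(confidences):
        raise ValueError("k must be between 1 and the number of candidate views")
    # Rank view indices from most to least confident, then keep the first k.
    ranked = sorted(
        range(len(confidences)), key=lambda i: confidences[i], reverse=True
    )
    return sorted(ranked[:k])
```

For instance, `select_top_k_views([0.1, 0.9, 0.5, 0.7], 2)` keeps views 1 and 3.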
In the output results, epoch_error_3d_conf_select shows the average joint error.
- Select 15 views from 25 views:
python view_select_a2j.py --config config/nyu/eval_25select15_light.yaml
- Select 9 views from 25 views:
python view_select_a2j.py --config config/nyu/eval_25select9_light.yaml
- Select 3 views from 25 views:
python view_select_a2j.py --config config/nyu/eval_25select3_light.yaml
- Select 1 view from 25 views:
python view_select_a2j.py --config config/nyu/eval_25select1_light.yaml
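Running all of the evaluation commands above one by one is repetitive, so a small helper can generate them. This sketch only assembles the commands listed in this README for NYU; the function name is hypothetical, and passing another dataset name assumes that its configuration files mirror the NYU file names, which is not verified here.

```python
from typing import List

UNIFORM_VIEW_COUNTS = [25, 15, 9, 3, 1]
SELECT_VIEW_COUNTS = [15, 9, 3, 1]


def eval_commands(dataset: str = "nyu") -> List[List[str]]:
    """Build the evaluation commands: uniform sampling, then the two selection groups."""
    commands = []
    for n in UNIFORM_VIEW_COUNTS:
        config = f"config/{dataset}/eval_uniform{n}.yaml"
        commands.append(["python", "train_a2j.py", "--config", config])
    for n in SELECT_VIEW_COUNTS:
        config = f"config/{dataset}/eval_25select{n}.yaml"
        commands.append(["python", "train_a2j.py", "--config", config])
    for n in SELECT_VIEW_COUNTS:
        config = f"config/{dataset}/eval_25select{n}_light.yaml"
        commands.append(["python", "view_select_a2j.py", "--config", config])
    return commands
```

Each returned command can be passed to `subprocess.run(cmd, check=True)` from the project directory, or printed first as a dry run.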
We also provide a trained model and configuration files for the ICVL hand posture dataset. To evaluate on ICVL, follow the commands for the NYU hand pose dataset and use the corresponding configuration files.
You can also train models using the following commands.
We only train a model that uniformly samples 25 views, which is also suitable for uniformly sampling 15, 9, 3 and 1 views.
python train_a2j.py --config config/nyu/train_uniform.yaml
The following commands train models using the "teacher" network to select 15, 9, and 3 views from 25 views, respectively. The model that selects 1 view from 25 views is the same as the model that selects 3 views from 25 views.
python train_a2j.py --config config/nyu/train_25select15.yaml
python train_a2j.py --config config/nyu/train_25select9.yaml
python train_a2j.py --config config/nyu/train_25select3.yaml
The following commands train models using the "student" network to select 15, 9, and 3 views from 25 views, respectively.
The model that selects 1 view from 25 views is the same as the model that selects 3 views from 25 views.
This step requires the trained "teacher" confidence network. Please set the pre_a2j field of the configuration file to the path of the previously trained model.
python view_select_a2j.py --config config/nyu/train_25select15_light.yaml
python view_select_a2j.py --config config/nyu/train_25select9_light.yaml
python view_select_a2j.py --config config/nyu/train_25select3_light.yaml
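The pre_a2j field mentioned above can be edited by hand, or with a small script such as the sketch below. It is a text-level edit that avoids a YAML library, and it assumes that `pre_a2j:` appears on its own line in the configuration file with a plain, unquoted value. The function name and the checkpoint path in the usage note are hypothetical.

```python
import re
from pathlib import Path


def set_pre_a2j(config_file: str, checkpoint_path: str) -> bool:
    """Rewrite the first `pre_a2j:` line of a config file; return True if it was found."""
    path = Path(config_file)
    # Capture the indentation and key so only the value after the colon changes.
    pattern = re.compile(r"^([ \t]*pre_a2j[ \t]*:).*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)} {checkpoint_path}", path.read_text(), count=1
    )
    if count:
        path.write_text(new_text)
    return bool(count)
```

For example, `set_pre_a2j("config/nyu/train_25select15_light.yaml", "checkpoint/teacher.pth")` points the student training at a teacher checkpoint; replace `checkpoint/teacher.pth` with the actual path of your trained model.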
We provide configuration files for the ICVL hand posture dataset. To train on ICVL, follow the commands for the NYU hand pose dataset and use the corresponding configuration files.
Please cite this paper if you use it in your work:
@inproceedings{Cheng2022virtualview,
title={Efficient Virtual View Selection for 3D Hand Pose Estimation},
author={Jian Cheng and Yanguang Wan and Dexin Zuo and Cuixia Ma and Jian Gu and Ping Tan and Hongan Wang and Xiaoming Deng and Yinda Zhang},
booktitle={AAAI Conference on Artificial Intelligence (AAAI)},
year={2022}
}
We use part of the great code from A2J, HandAugment and attention-is-all-you-need-pytorch.