BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling
This repo is the official code for BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling (CVPR 2023). [Paper]
We recommend using an Anaconda virtual environment with Python 3.9.
- Install PyTorch 1.10.1, Detectron2, and PyTorch3D
# pytorch
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
# detectron2
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
# pytorch3d
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
- Install Requirements
pip install -r requirement.txt
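After the installation steps, it can help to confirm that the pinned versions ended up in the active environment. The following is a minimal sketch, not part of this repository: the helper name check_versions is hypothetical, and it assumes the packages expose standard metadata that importlib.metadata can read.

```python
from importlib import metadata

# Versions pinned by the conda command above.
EXPECTED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchaudio": "0.10.1",
}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_expected)}.

    Hypothetical helper for illustration; it only reads package metadata
    and never imports the packages themselves.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu113" are ignored in the comparison.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {EXPECTED[name]}, found {installed} [{status}]")
```

A missing package is reported with a version of None rather than raising, so the check can run before every dependency is in place.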
- Set up the data by following the instructions here.
First, download the pre-trained weights and place them in the root [CODE] path. Then run the command below.
python main.py
- To stabilize model convergence, we first trained the 2D modules (Box, Keypoint) starting from weights pre-trained on COCO 2017. You can download the pre-trained 2D module weights (res2net_bifpn.pth) here.
- Replace best_rel_model.pth with res2net_bifpn.pth on the third line of configs/custom.yaml.
- Run the command below.
python main.py -t
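The config edit in the training steps can also be scripted. The sketch below is an illustration under assumptions: the function names swap_weights and update_config are hypothetical, and the code only assumes that the string best_rel_model.pth appears somewhere on the third line of configs/custom.yaml. It does not assume anything else about that line, and it refuses to change the file otherwise.

```python
from pathlib import Path


def swap_weights(text, old="best_rel_model.pth", new="res2net_bifpn.pth", line_no=3):
    """Replace `old` with `new` on line `line_no` (1-based) of `text`.

    Hypothetical helper; raises ValueError if that line does not
    contain `old`, so an unexpected config layout is never modified.
    """
    lines = text.splitlines(keepends=True)
    idx = line_no - 1
    if idx >= len(lines) or old not in lines[idx]:
        raise ValueError(f"line {line_no} does not contain {old!r}")
    lines[idx] = lines[idx].replace(old, new)
    return "".join(lines)


def update_config(path="configs/custom.yaml"):
    """Rewrite the config file in place using swap_weights."""
    cfg = Path(path)
    cfg.write_text(swap_weights(cfg.read_text()))
```

Calling update_config() from the repository root, then running python main.py -t, mirrors the manual steps above.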
- Finish either the inference or the training process first.
- Move to the evaluation folder.
- Run the command below.
python eval.py --light --test_dir ../outputs/res --gt_dir ../data/apollo/val/apollo_annot --res_file test_results.txt
- You can find the A3DP results in test_results.txt.
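Because evaluation depends on outputs from a finished inference or training run, a small pre-flight check can catch a missing directory before eval.py starts. This is a sketch only: build_eval_command and missing_dirs are hypothetical helper names, and the default paths are simply the ones used in the command above.

```python
import sys
from pathlib import Path


def build_eval_command(test_dir="../outputs/res",
                       gt_dir="../data/apollo/val/apollo_annot",
                       res_file="test_results.txt"):
    """Assemble the eval.py invocation shown above as an argument list."""
    return [
        sys.executable, "eval.py", "--light",
        "--test_dir", str(test_dir),
        "--gt_dir", str(gt_dir),
        "--res_file", str(res_file),
    ]


def missing_dirs(*dirs):
    """Return the subset of `dirs` that are not existing directories."""
    return [str(d) for d in dirs if not Path(d).is_dir()]
```

From inside the evaluation folder, one could call missing_dirs("../outputs/res", "../data/apollo/val/apollo_annot") and, if the result is empty, pass build_eval_command() to subprocess.run with check=True.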
- Install the Open3D Python library
pip install open3d==0.14.1
Note: you must use version 0.14.1.
- Move to the vis folder.
- Run the command below.
python vis_apollo.py --output [path where the results are saved] --file [file name to vis] --save [path to save vis results]
python vis_apollo.py --output ../outputs --file 171206_081122658_Camera_5 --save vis_results #example
- A manual for the Open3D UI is available here.
- You can find the visualization results at the [save] path.
- [file].image_plane.png: visualization results rendered on the image plane.
- [file].3d.png: visualization results from your own rendering with the Open3D UI.
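The output naming described above can be expressed as a short helper, for example to check which result files exist after a visualization run. This is a minimal sketch: expected_vis_outputs is a hypothetical name, and the two file-name patterns are taken directly from the list above.

```python
from pathlib import Path


def expected_vis_outputs(save_dir, file_name):
    """Map each visualization kind to the path it is expected at.

    Hypothetical helper; it only builds paths and does not check
    that vis_apollo.py actually wrote them.
    """
    save_dir = Path(save_dir)
    return {
        "image_plane": save_dir / f"{file_name}.image_plane.png",
        "3d": save_dir / f"{file_name}.3d.png",
    }


if __name__ == "__main__":
    # Values from the example command above.
    outputs = expected_vis_outputs("vis_results", "171206_081122658_Camera_5")
    for kind, path in outputs.items():
        print(f"{kind}: {path} (exists: {path.exists()})")
```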
We achieved state-of-the-art performance on the ApolloCar3D dataset.
This repository is released under the MIT license. Note that the dataset used (ApolloCar3D) is subject to its own license, which may not permit commercial use.