This project uses pose information (which can be extracted with OpenPose) to infer human actions in short videos. There are three different models in this code base.
- Action recognition from RGB frames can easily overfit to the background
- We tested a ResNet (on RGB frames) and a GCN (on pose) on two datasets, each with 3 actions: stand, crouch, walk. The pose-based model transferred across datasets a lot better!
Spatial Temporal Graph Convolutional Network (Link)
- Spatial edges connect neighbouring joints within a frame
- Temporal edges connect the same joint across consecutive frames (see the adjacency sketch after this list)
- The paper proposes several partitioning strategies for designing the graph convolution kernels
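A minimal sketch of how such a spatial-temporal skeleton graph can be assembled, assuming a flat adjacency matrix over (frame, joint) nodes; `build_st_graph` and `example_bones` are illustrative names, and the bone list is intentionally incomplete rather than the full OpenPose BODY_25 topology:

```python
import numpy as np

def build_st_graph(num_joints, num_frames, bones):
    """Adjacency matrix of a spatial-temporal skeleton graph.

    Nodes are (frame, joint) pairs flattened to frame * num_joints + joint.
    Spatial edges connect joints within a frame along the skeleton bones;
    temporal edges connect the same joint in consecutive frames.
    """
    n = num_joints * num_frames
    adj = np.zeros((n, n), dtype=np.float32)

    for t in range(num_frames):
        offset = t * num_joints
        # Spatial edges: bones inside one frame.
        for i, j in bones:
            adj[offset + i, offset + j] = 1.0
            adj[offset + j, offset + i] = 1.0
        # Temporal edges: same joint in the next frame.
        if t + 1 < num_frames:
            nxt = (t + 1) * num_joints
            for v in range(num_joints):
                adj[offset + v, nxt + v] = 1.0
                adj[nxt + v, offset + v] = 1.0
    return adj

# Example: a few illustrative bones for a 25-joint skeleton over 16 frames.
example_bones = [(0, 1), (1, 2), (2, 3), (1, 5), (5, 6)]
A = build_st_graph(num_joints=25, num_frames=16, bones=example_bones)
print(A.shape)  # (400, 400)
```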
GCN + LSTM Feature Prediction
- Extract features from the skeleton at each time step using a GCN
- Use an LSTM to predict the next time step's encoded feature (see the sketch after this list)
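A minimal PyTorch sketch of this idea; the class names (`FrameGCN`, `SkeletonPredictor`), feature sizes, and the identity adjacency used in the example are assumptions for illustration, not the exact modules in this repo:

```python
import torch
import torch.nn as nn

class FrameGCN(nn.Module):
    """One graph-convolution layer applied independently to each frame."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (batch, frames, joints, in_dim), adj: (joints, joints)
        x = torch.einsum("vw,btwc->btvc", adj, x)  # aggregate neighbour joints
        return torch.relu(self.lin(x))

class SkeletonPredictor(nn.Module):
    """Encode each frame with a GCN, then let an LSTM predict the next feature."""
    def __init__(self, in_dim=2, feat_dim=64, num_joints=25):
        super().__init__()
        self.gcn = FrameGCN(in_dim, feat_dim)
        self.lstm = nn.LSTM(feat_dim * num_joints, feat_dim * num_joints,
                            batch_first=True)

    def forward(self, x, adj):
        b, t, v, _ = x.shape
        feats = self.gcn(x, adj).reshape(b, t, -1)  # per-frame encodings
        pred, _ = self.lstm(feats)                  # pred[:, i] targets feats[:, i+1]
        return feats, pred

# Training signal: match the LSTM output at step i to the encoding at step i+1.
model = SkeletonPredictor()
x = torch.randn(4, 16, 25, 2)   # (batch, frames, joints, xy)
adj = torch.eye(25)             # placeholder adjacency for illustration
feats, pred = model(x, adj)
loss = nn.functional.mse_loss(pred[:, :-1], feats[:, 1:].detach())
```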
Joints as Low-Resolution Images
- Each joint is rendered as its own low-resolution image
- The 25 joints are stacked to form a 25-channel image
- This representation preserves the spatial relations between joints (see the sketch after this list)
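A minimal sketch of one way to build this representation, assuming normalised joint coordinates and a single bright pixel per channel; `joints_to_image` and the 32x32 resolution are illustrative choices, not necessarily what this repo uses:

```python
import numpy as np

def joints_to_image(joints_xy, size=32):
    """joints_xy: (25, 2) array of joint coordinates normalised to [0, 1].

    Returns a (25, size, size) array where channel k contains a single
    bright pixel at joint k's location, so the spatial relations between
    joints are preserved in image space.
    """
    img = np.zeros((joints_xy.shape[0], size, size), dtype=np.float32)
    for k, (x, y) in enumerate(joints_xy):
        col = int(np.clip(x * (size - 1), 0, size - 1))
        row = int(np.clip(y * (size - 1), 0, size - 1))
        img[k, row, col] = 1.0
    return img

# Example: random normalised joints for one frame.
frame = joints_to_image(np.random.rand(25, 2))
print(frame.shape)  # (25, 32, 32)
```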