Awesome Human Action Recognition Repository
A list of the most popular methods for human action recognition.
[arXiv:1808.07507] Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization. [PDF]
Unaiza Ahsan, Rishi Madhok
[arXiv:1711.04161] End-to-end Video-level Representation Learning for Action Recognition. [PDF][code]
Jiagang Zhu, Wei Zou, Zheng Zhu
[2017 IEEE Access:TPAMI] Long-Term Temporal Convolutions for Action Recognition [PDF]
Gul Varol, Ivan Laptev, and Cordelia Schmid, Fellow, IEEE
Human Action Recognition and Prediction: A Survey [PDF]
Yu Kong, Member, IEEE, and Yun Fu, Senior Member, IEEE
Graph Convolutional Networks for Temporal Action Localization. Authors: Chuang Gan et al.
Action recognition with spatial-temporal discriminative filter banks. Authors: Yuanjun Xiong et al.
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures. Authors: Google Brain
Neural architecture search for video understanding: brute force works wonders.
DynamoNet: Dynamic Action and Motion Network. Authors: Ali Diba, Luc Van Gool
Reasoning About Human-Object Interactions Through Dual Attention Networks. Authors: Bolei Zhou
Learning Temporal Action Proposals with Fewer Labels. Authors: Fei-Fei Li's group at Stanford, Juan Carlos Niebles
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition. Authors: Dima Damen et al.
SlowFast Networks for Video Recognition (paper link: https://arxiv.org/abs/1812.03982), by Kaiming He from FAIR
Video Classification with Channel-Separated Convolutional Networks (paper link: https://arxiv.org/abs/1904.02811), by Du Tran from FAIR
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition. Oral (paper link: https://arxiv.org/abs/1904.04289), by Du Tran from FAIR
DistInit: Learning Video Representations without a Single Labeled Video (paper link: https://arxiv.org/abs/1901.09244), by Du Tran from FAIR. A very simple idea.
TSM: Temporal Shift Module for Efficient Video Understanding. Authors: Ji Lin, Chuang Gan, Song Han. Paper link: https://arxiv.org/abs/1811.08383 GitHub link: https://github.com/mit-han-lab/temporal-shift-module Comment: it feels like a fixed convolution kernel with a mask?
BMN: Boundary-Matching Network for Temporal Action Proposal Generation (paper link: https://arxiv.org/abs/1907.09702). Explanation by the author, Tianwei Lin: "[ICCV 2019][Temporal Action Proposal] Boundary-Matching Network Explained in Detail" (original link: https://zhuanlan.zhihu.com/p/75444151)
Weakly Supervised Energy-Based Learning for Action Segmentation. Oral. Link: https://github.com/JunLi-Galios/CDFL
Pose-aware Dynamic Attention for Human Object Interaction Detection. Link: https://github.com/bobwan1995/PMFNet
What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling LSTMs and Modality Attention. Project link: https://iplab.dmi.unict.it/rulstm/ Paper link: https://arxiv.org/pdf/1905.09035.pdf GitHub: https://github.com/fpv-iplab/rulstm
Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings. Paper link: https://arxiv.org/abs/1908.03477 Project link: https://mwray.github.io/FGAR/
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips. Authors: Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, Josef Sivic. Paper link: https://arxiv.org/abs/1906.03327 Project and code link: https://github.com/antoine77340/howto100m
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation. Authors: Min-Hung Chen, Zsolt Kira, Ghassan AlRegib, Jaekwon Woo, Ruxin Chen, Jian Zheng. Paper link: https://arxiv.org/abs/1907.12743 GitHub link: https://github.com/cmhungsteve/TA3N
STM: SpatioTemporal and Motion Encoding for Action Recognition, from ZJU and SenseTime Group Limited. Paper link: https://arxiv.org/abs/1908.02486
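A note on the TSM entry above, which wonders whether the temporal shift is just a fixed convolution kernel with a mask. The sketch below is a minimal pure-Python illustration of that reading, not the authors' implementation. It assumes one eighth of the channels read from the next frame, one eighth read from the previous frame, and the rest stay in place, with zero fill at the clip boundaries. The names `temporal_shift`, `shift_as_fixed_conv`, and `fold_div`, and the toy clip, are hypothetical and chosen only for illustration.

```python
def temporal_shift(frames, fold_div=8):
    """Shift a fraction of channels along the time axis.

    frames: list of T frames, each a list of C channel values.
    Assumption: the first C // fold_div channels read from the next frame,
    the second C // fold_div read from the previous frame, the rest stay put.
    Values shifted in from outside the clip are zero-filled.
    """
    num_frames, num_channels = len(frames), len(frames[0])
    fold = num_channels // fold_div
    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # read from the next frame
            elif c < 2 * fold:
                src = t - 1      # read from the previous frame
            else:
                src = t          # channel is left unshifted
            if 0 <= src < num_frames:
                out[t][c] = frames[src][c]
    return out


def shift_as_fixed_conv(frames, fold_div=8):
    """Same operation written as a per-channel temporal convolution
    with fixed one-hot kernels over (t - 1, t, t + 1)."""
    num_frames, num_channels = len(frames), len(frames[0])
    fold = num_channels // fold_div
    out = [[0.0] * num_channels for _ in range(num_frames)]
    for c in range(num_channels):
        if c < fold:
            kernel = (0.0, 0.0, 1.0)
        elif c < 2 * fold:
            kernel = (1.0, 0.0, 0.0)
        else:
            kernel = (0.0, 1.0, 0.0)
        for t in range(num_frames):
            total = 0.0
            for k, weight in enumerate(kernel):
                src = t + k - 1
                if 0 <= src < num_frames:  # zero padding at clip boundaries
                    total += weight * frames[src][c]
            out[t][c] = total
    return out


# Toy clip: 3 frames, 8 channels, value = 10 * frame_index + channel_index.
clip = [[float(10 * t + c) for c in range(8)] for t in range(3)]
shifted = temporal_shift(clip)
```

Under these assumptions both functions return the same values on the toy clip, which matches the intuition in the comment: the shift needs no learned weights and behaves like a fixed, channel-dependent one-hot temporal kernel.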
[2018,ECCV] Modality Distillation with Multiple Stream Networks for Action Recognition [PDF]
Bolei Zhou, Alex Andonian, Aude Oliva, and Antonio Torralba
[2018,ECCV] Graph Distillation for Action Detection with Privileged Modalities [PDF]
Stanford University, Google Inc.
[2018,ECCV] Spatio-Temporal Channel Correlation Networks for Action Classification [PDF]
Siyuan Qi, Wenguan Wang, Baoxiong Jia, Jianbing Shen, Song-Chun Zhu
[2018,ECCV] Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification [PDF]
Yang Du, Chunfeng Yuan, Bing Li, Lili Zhao, Yangxi Li and Weiming Hu
[2018,ECCV] Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization [PDF]
Humam Alwassel, Fabian Caba Heilbron, and Bernard Ghanem
[2018,ECCV] Action Anticipation with RBF Kernelized Feature Mapping RNN [PDF]
Yuge Shi, Basura Fernando, Richard Hartley
[2018,ECCV] Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning [PDF]
Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan
Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, Lorenzo Torresani, Manohar Paluri
[2018,ECCV] End-to-End Joint Semantic Segmentation of Actors and Actions in Video [PDF]
[2018,ECCV] Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset [PDF]
Jamie Ray, Heng Wang, Du Tran, Yufei Wang, et al.
Video Action Recognition [PDF] Shuyang Sun, Zhanghui Kuang, Wanli Ouyang, Lu Sheng, Wei Zhang
L. Wang, W. Li, W. Li, and L. Van Gool
Yue Zhao, Yuanjun Xiong
AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos [PDF]
Amlan Kar, Nishant Rai, Karan Sikka, Gaurav Sharma
[2017,CVPR] On the Integration of Optical Flow and Action Recognition [PDF]
Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black
[2016,CVPR] Convolutional Two-Stream Network Fusion for Video Action Recognition [PDF]
Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
[2016,CVPR] A Key Volume Mining Deep Framework for Action Recognition [PDF]
Wangjiang Zhu, Jie Hu, Gang Sun, Xudong Cao, Yu Qiao
[2016,ECCV] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition [PDF]
Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool
[2015,CVPR] Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors [PDF]
Limin Wang, Yu Qiao, Xiaoou Tang
[2015,ICCV] Learning Spatiotemporal Features with 3D Convolutional Networks [PDF]
D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri
[2014,CVPR] Large-Scale Video Classification with Convolutional Neural Networks [PDF]
A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei
[2014,NIPS] Two-Stream Convolutional Networks for Action Recognition in Videos [PDF]
Karen Simonyan, Andrew Zisserman
Here we focus on deep learning (DL) methods, listed below.
AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos [PDF]
Amlan Kar, Nishant Rai, Karan Sikka, Gaurav Sharma
[2014,IEEE Access:TPAMI] 3D Convolutional Neural Networks for Human Action Recognition
Shuiwang Ji, Wei Xu, Ming Yang, Kai Yu
[2017 IEEE Access:TPAMI] Long-Term Temporal Convolutions for Action Recognition [PDF]
Gul Varol, Ivan Laptev, and Cordelia Schmid, Fellow, IEEE
[2014,NIPS] Two-Stream Convolutional Networks for Action Recognition in Videos [PDF]
[2016,ECCV] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition [PDF]
[2016,CVPR] A Key Volume Mining Deep Framework for Action Recognition [PDF]
Video Action Recognition [PDF] Shuyang Sun, Zhanghui Kuang, Wanli Ouyang, Lu Sheng, Wei Zhang
[2015,CVPR] Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors [PDF]
[2017,CVPR] On the Integration of Optical Flow and Action Recognition [PDF]
Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black
[arXiv:1712.08416] What have we learned from deep representations for action recognition?
Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black
[arXiv:1802] Structured Label Inference for Visual Understanding Nelson Nauata, Hexiang Hu, Guang-Tong Zhou, Zhiwei Deng, Zicheng Liao and Greg Mori
[2018,ECCV] Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset [PDF]
- Year: publication year
- Videos: number of clips
- Views: number of view angles
- Actions: number of action classes
- Subjects: number of people in the videos
- Modality: RGB or RGB-D
- Env: Controlled (C) or Uncontrolled (U)
Dataset papers 2017 [PDF]
2018 video benchmarks: a review [PDF]
Video datasets online [HTML]
Computer vision datasets online [HTML]
Dataset | Year | Videos | Views | Actions | Subjects | Modality | Env (C/U) | Related Paper |
---|---|---|---|---|---|---|---|---|
KTH | 2004 | 599 | 1 | 6 | 25 | RGB | C | Recognizing human actions: A local SVM approach, IEEE ICPR 2004 [PDF] |
HMDB51 | 2011 | 7000 | - | 51 | - | RGB | U | HMDB: A large video database for human motion recognition, ICCV 2011 [PDF] |
UCF101 | 2012 | 13320 | - | 101 | - | RGB | U | UCF101: A dataset of 101 human action classes from videos in the wild, 2012, CRCV-TR-12-01 [PDF] |
- HMDB51 82.1% 2017