[CVPR 2023] This is the official implementation of "Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network"

nku-zhichengzhang/CTEN

Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network [CVPR2023]

Zhicheng Zhang, Lijuan Wang, and Jufeng Yang

PyTorch Conference License

This is the official implementation of our CVPR 2023 paper.

News

  • Adding comments
  • Reconstructing code

Publication

Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023.
[Paper] [PDF] [Video] [Demo]

Abstract

Automatically predicting the emotions of user-generated videos (UGVs) has received increasing interest recently. However, existing methods mainly focus on a few key visual frames, which may limit their capacity to encode the context that depicts the intended emotions. To tackle this, we propose a cross-modal temporal erasing network that locates not only keyframes but also context and audio-related information in a weakly supervised manner. Specifically, we first leverage the intra- and inter-modal relationships among different segments to accurately select keyframes. Then, we iteratively erase keyframes to encourage the model to concentrate on the contexts that include complementary information. Extensive experiments on three challenging benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art approaches.
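
The select-then-erase idea in the abstract can be illustrated with a toy sketch. This is not the code in this repo: the function names, the fixed per-segment scores, and the top-k selection rule are all assumptions made for illustration. In the actual model the scores would come from learned intra- and inter-modal attention and would be recomputed after each erasing step, whereas here they stay fixed so the loop is easy to follow.

```python
from typing import List, Sequence


def select_keyframes(scores: Sequence[float], erased: Sequence[bool], k: int) -> List[int]:
    """Return (in temporal order) the k highest-scoring segments not yet erased."""
    candidates = [i for i, gone in enumerate(erased) if not gone]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    return sorted(candidates[:k])


def iterative_erase(scores: Sequence[float], k: int = 2, rounds: int = 2) -> List[List[int]]:
    """Toy loop: pick keyframes, erase them, then select again from what remains.

    Returns one list of selected segment indices per round.
    Assumption: scores are fixed; a real model would re-score after erasing.
    """
    erased = [False] * len(scores)
    selections = []
    for _ in range(rounds):
        picked = select_keyframes(scores, erased, k)
        if not picked:  # nothing left to select
            break
        selections.append(picked)
        for i in picked:
            erased[i] = True  # erased segments are hidden from later rounds
    return selections


# Hypothetical attention scores for a video split into 8 segments.
example_scores = [0.05, 0.40, 0.10, 0.30, 0.02, 0.08, 0.03, 0.02]
print(iterative_erase(example_scores, k=2, rounds=2))  # [[1, 3], [2, 5]]
```

The first round picks the two most salient segments (the keyframes); once they are erased, the second round is forced onto lower-scoring context segments, which is the behavior the erasing step is meant to encourage.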

Running

To adjust settings such as the number of epochs and the batch size, see opts.py.

Train and evaluate the model by running the script below.

$ bash run.sh
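
For readers unfamiliar with option files of this kind, the sketch below shows how such settings are commonly exposed with argparse. The flag names and defaults here are hypothetical and are not taken from this repo; check opts.py for the real option names before passing anything to the training script.

```python
import argparse


def parse_opts(argv=None):
    """Parse illustrative training options.

    Hypothetical flags and defaults; the real ones live in opts.py.
    """
    parser = argparse.ArgumentParser(description="Illustrative training options")
    parser.add_argument("--n_epochs", type=int, default=100,
                        help="number of training epochs (hypothetical flag)")
    parser.add_argument("--batch_size", type=int, default=16,
                        help="samples per batch (hypothetical flag)")
    return parser.parse_args(argv)


# Override one option and keep the default for the other.
opts = parse_opts(["--n_epochs", "50"])
print(opts.n_epochs, opts.batch_size)  # 50 16
```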

The datasets used are available at Ekman-6, VideoEmotion-8, and CAER.

References

We referred to the VAANet repo when writing the code.

Citation

If you find this repo useful in your project or research, please consider citing the relevant publication.

BibTeX Citation

@InProceedings{Zhang_2023_CVPR,
    author    = {Zhang, Zhicheng and Wang, Lijuan and Yang, Jufeng},
    title     = {Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {18888-18897}
}
