Contextual Explainable Video Representation: Human Perception-based Understanding

This repo collects our existing works on contextual explainable video representation.

Our works on Temporal Action Proposals Generation are summarized as follows:

Our works on Video Paragraph Captioning are summarized as follows:

Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning (VLCap, published in ICIP 2022):
- Paper: https://ieeexplore.ieee.org/abstract/document/9897766
- ArXiv: https://arxiv.org/abs/2206.12972
- Source code: https://github.com/UARK-AICV/VLCAP

VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning (VLTinT, published in AAAI 2023):
- ArXiv: https://arxiv.org/abs/2211.15103
- Source code: https://github.com/UARK-AICV/VLTinT
