Skip to content

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

Notifications You must be signed in to change notification settings

R00Kie-Liu/Sampler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

Please refer to https://arxiv.org/abs/2207.09759 for this paper.

Introduction


A uniform sampler may overlook frames containing key actions. Critical regions involving the actors and objects may be too small to be properly recognized. (b) Our sampler can (I) select frames from an entire video that contribute most to few-shot recognition, (II) amplify discriminative regions in each frame. This sampling strategy is also dynamically adjusted for each video according to the episode task at hand.

Overview


Todo

  • Realease the ActivityNet dataset few-shot split file.
  • Realease the core part of Sampler.
  • Code of Sampler + ProtoNet.
  • Realease the whole training and inference code.
  • Sampler + TA2N/TRX/OTAM.

Usage

Example of spatial-temporal sampling from input query video set

from sampler import Selector

args = ArgsObject() # Refer to sampler.py for details of args
S = Selector(args).cuda()
input = torch.rand(args.way*args.shot*args.seq_len, 3, args.img_size, args.img_size).cuda() 
# Input: way*shot*frame, c, w, h
n, c, w, h = input.size()
print('Input Data shape:', input.shape)

# Indice: way*shot, k, len
indices,_,_,_,_,_ = S(input)
input = input.view(args.way*args.shot, args.seq_len, -1) 
# Output: way*shot, k, c*w*h
subset = torch.bmm(indices, input)
# Output: way*shot*k, c, w, h
subset = subset.view(-1, c, w, h)
print('Data shape output by sampler:', subset.shape)

Bibtex

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@inproceedings{liu2022task,
  title={Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition},
  author={Liu, Huabin and Lv, Weixian and See, John and Lin, Weiyao},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={6230--6240},
  year={2022}
}

Contact

Please feel free to contact [email protected] if you have any questions.

About

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published