This is the official implementation of our paper "Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets". Our code is heavily adapted from SPT_LSA_ViT.