Dear authors,
Great work!
Could you share the script used to reproduce the zero-shot numbers on SSv2 (Table 7)?
In my experiments, and in line with what other papers report [1, 2], a frozen CLIP with mean pooling over per-frame features reaches only 2.7% accuracy on the 174 classes in SSv2. Could you please explain this discrepancy, or point out what I may be missing?
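The baseline described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact evaluation script: it assumes the per-frame image embeddings and the class-prompt text embeddings have already been extracted with a frozen CLIP, that frames are L2-normalized before pooling, and the function names (`zero_shot_predict`, `top1_accuracy`) are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_predict(frame_feats, text_feats):
    """Predict a class index for one video.

    frame_feats: (T, D) per-frame image embeddings (assumed precomputed).
    text_feats:  (C, D) one text embedding per class prompt (assumed precomputed).
    """
    # Assumption: normalize each frame embedding before averaging.
    frame_feats = l2_normalize(frame_feats)
    # Mean pooling over the temporal axis gives one video-level embedding.
    video_feat = l2_normalize(frame_feats.mean(axis=0))
    text_feats = l2_normalize(text_feats)
    # Cosine similarity between the video and every class prompt.
    sims = text_feats @ video_feat
    return int(np.argmax(sims))


def top1_accuracy(videos, labels, text_feats):
    """Top-1 accuracy over a list of (T, D) arrays and integer labels."""
    preds = np.array([zero_shot_predict(v, text_feats) for v in videos])
    return float(np.mean(preds == np.array(labels)))
```

For SSv2, `text_feats` would have 174 rows, one per class. Details that this sketch leaves open, such as the prompt template, the number of sampled frames, and the CLIP backbone, could each contribute to a gap in the reported numbers.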
[1] Videoprompter: an ensemble of foundational models for zero-shot video understanding. https://arxiv.org/pdf/2310.15324
[2] GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? https://arxiv.org/pdf/2311.15732