Reproduce on DiDeMo dataset #18

Open
dengrui-64 opened this issue Jan 19, 2024 · 5 comments

Comments

dengrui-64 commented Jan 19, 2024

Hi, we appreciate your two papers and have thoroughly examined them.

The replication process for the MSRVTT results on Mug-STAN was successful, yielding outcomes that closely align with the paper's findings.

However, we ran into difficulties reproducing the DiDeMo results. We obtained only 46.3% R@1 and 72.4% R@5, both short of the numbers reported in the paper (49.6% R@1 and 75.3% R@5).
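For readers comparing numbers, here is a minimal sketch of how R@K is commonly computed for text-to-video retrieval. It assumes a square similarity matrix in which the correct video for query `i` sits at index `i`; the function name `recall_at_k` and the toy matrix are illustrative and are not taken from the Mug-STAN code.

```python
def recall_at_k(sim, k):
    """Percentage of text queries whose correct video ranks in the top k.

    sim[i][j] is the similarity of text query i to video j.
    Assumption: the ground-truth video for query i is video i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Zero-based rank of the correct video: how many videos score strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


# Toy example with 3 queries and 3 videos (made-up scores).
sim = [
    [0.9, 0.1, 0.3],
    [0.8, 0.5, 0.2],
    [0.1, 0.7, 0.6],
]
print(recall_at_k(sim, 1))
print(recall_at_k(sim, 2))
```

Because the metric depends only on ranks within each row, a different test split or caption grouping changes the candidate pool and can shift R@1 and R@5 even with identical model weights.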

Here are our reproduced results. Can you give me some advice on how to attain the desired results?

Results:

@dengrui-64 dengrui-64 changed the title reproduce on DiDeMo dataset Reproduce on DiDeMo dataset Jan 19, 2024
@dengrui-64
Author

@farewellthree
Owner

DiDeMo may need more GPUs to keep the batch size at 128. Are the number of frames (64) and the batch size (128) both correct?

@dengrui-64
Author

dengrui-64 commented Jan 23, 2024

Thank you for your reply. I have reviewed my DiDeMo experiment configuration and confirmed that the total batch size is 128: I set the training batch size to 16 and used 8 GPUs. I also noticed that gradient_checkpointing is set to True in mugstan_didemo_b32_hf.py (line 6). Does this parameter affect the results?
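The batch-size arithmetic can be written out as a small sketch. It assumes the configured value of 16 is a per-GPU batch size; `effective_batch_size` and `grad_accum_steps` are hypothetical names used only for illustration, not options from the Mug-STAN configs.

```python
def effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps=1):
    """Number of samples contributing to one optimizer step.

    Assumption: per_gpu_batch is the batch size seen by each GPU,
    and every GPU processes a distinct shard of the data.
    """
    return per_gpu_batch * num_gpus * grad_accum_steps


# The setup described above: 16 samples per GPU on 8 GPUs.
total = effective_batch_size(per_gpu_batch=16, num_gpus=8)
print(total)
```

One caveat for contrastive training in general: gradient accumulation raises the sample count per optimizer step but does not add in-batch negatives to each forward pass, so 16 × 8 GPUs and 16 × 4 GPUs × 2 accumulation steps are not necessarily equivalent.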

@farewellthree
Owner

In theory it has no effect. Is the testing split correct? In previous work, it seems that fine-tuning and zero-shot testing use different splits. See https://github.com/OpenGVLab/unmasked_teacher/blob/main/multi_modality/DATASET.md
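One way to check for a split mismatch is to compare the video IDs in two annotation files. The sketch below works on already-loaded lists of records and assumes each record stores its video identifier under a `"video"` key; that key, the function name `compare_splits`, and the sample IDs are assumptions for illustration, so adjust them to the actual annotation format.

```python
def compare_splits(mine, reference, key="video"):
    """Compare the sets of video IDs in two annotation lists.

    Assumption: each annotation is a dict with the video ID stored under `key`.
    """
    mine_ids = {item[key] for item in mine}
    ref_ids = {item[key] for item in reference}
    return {
        "only_mine": sorted(mine_ids - ref_ids),
        "only_reference": sorted(ref_ids - mine_ids),
        "shared": len(mine_ids & ref_ids),
    }


# Made-up records standing in for two loaded annotation files.
mine = [{"video": "a.mp4"}, {"video": "b.mp4"}, {"video": "c.mp4"}]
reference = [{"video": "b.mp4"}, {"video": "c.mp4"}, {"video": "d.mp4"}]
report = compare_splits(mine, reference)
print(report)
```

If either `only_mine` or `only_reference` is non-empty, the two runs are being evaluated on different test sets and their R@K values are not directly comparable.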

@dengrui-64
Author

Thank you for your prompt response. Would it be possible for you to provide us with your annotation files? This will allow us to align the results accurately.
