Unable to reproduce the domainbed accuracies #4
I am still looking for the detailed configs, but here are some results for those three datasets, along with the exact lr, wd, and layer configuration for GMoE-S/16 used to reproduce them. I think the difference may come from the MoE layer configuration; small datasets (VLCS, PACS) are especially sensitive to it.

For PACS:
lr: 3e-05
weight_decay: 1e-06
moe_layers=['F'] * 8 + ['S', 'F'] * 2, mlp_ratio=4., num_experts=6, drop_path_rate=0, router='top'

The results.txt is (in our previous version, the model was called SFMOE):

-------- Dataset: PACS, model selection method: training-domain validation set
Algorithm    A             C             P             S             Avg
SFMOE        89.9 +/- 0.0  82.1 +/- 0.0  99.2 +/- 0.0  81.2 +/- 0.0  88.1

For VLCS, the results are:

-------- Dataset: VLCS, model selection method: training-domain validation set
Algorithm    C             L             S             V             Avg
SFMOE        98.1 +/- 0.0  66.0 +/- 0.0  75.8 +/- 0.0  80.9 +/- 0.0  80.2

For OfficeHome, the results are:

-------- Dataset: OfficeHome, model selection method: training-domain validation set
Algorithm    A             C             P             R             Avg
GMOE_Tutel   72.6 +/- 0.0  58.9 +/- 0.0  81.3 +/- 0.0  83.9 +/- 0.0  74.2
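
For anyone copying the PACS settings quoted above, here is a minimal sketch that gathers them into a plain Python dict and inspects the layer pattern. The dict name `pacs_hparams` and the helper `describe_layers` are illustrative and not part of the GMoE codebase. Reading 'S' as a sparse MoE block and 'F' as a regular feed-forward block is an assumption based on the old model name SFMOE.

```python
# PACS settings as quoted in the comment, collected in one place.
# `pacs_hparams` and `describe_layers` are hypothetical names for illustration.
pacs_hparams = {
    "lr": 3e-05,
    "weight_decay": 1e-06,
    "moe_layers": ["F"] * 8 + ["S", "F"] * 2,
    "mlp_ratio": 4.0,
    "num_experts": 6,
    "drop_path_rate": 0,
    "router": "top",
}


def describe_layers(moe_layers):
    """Return the total depth and the indices of the 'S' entries."""
    s_positions = [i for i, kind in enumerate(moe_layers) if kind == "S"]
    return len(moe_layers), s_positions


depth, s_positions = describe_layers(pacs_hparams["moe_layers"])
print(depth, s_positions)  # 12 [8, 10]
```

With this pattern only two of the twelve blocks are marked 'S', both near the end of the network, so a different placement or count is an easy thing to check first when results do not match.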
Hello!
Hi! I think that at the time I ran only 1 seed, to provide the reproduced results quickly. I need some time to find the Terra configs since it has been a long time. I will get back to you as soon as possible!
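
A single seed is consistent with every entry in the tables being reported as `+/- 0.0`: there is no spread to estimate from one run. A minimal sketch of a mean and standard error over seeds follows. The function name is hypothetical, the three-seed accuracies are made-up numbers for illustration only, and the sketch uses the sample standard deviation, which may differ from the estimator in DomainBed's own result collection script.

```python
import math
import statistics


def mean_and_stderr(accuracies):
    """Mean and standard error of per-seed accuracies.

    With fewer than two values the spread cannot be estimated,
    so the error is reported as 0.0.
    """
    mean = statistics.fmean(accuracies)
    if len(accuracies) < 2:
        return mean, 0.0
    stderr = statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, stderr


# One seed: the error term is 0.0 by construction.
single = mean_and_stderr([89.9])

# Three seeds with made-up accuracies (not results from this thread).
multi = mean_and_stderr([80.0, 82.0, 84.0])

print(f"{single[0]:.1f} +/- {single[1]:.1f}")
print(f"{multi[0]:.1f} +/- {multi[1]:.1f}")
```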
Thank you for your reply! I have run some multi-seed GMoE experiments and often find that the accuracy decreases as more seeds are added. Could you tell me how to maintain high accuracy across multiple seeds? Or could you share the specific seeds that you used in your experiments?
Hello!
I am having trouble reproducing the accuracies reported in the paper (https://arxiv.org/pdf/2206.04046v5.pdf). I used the default hyperparameters mentioned in the paper, and obtained the following results:
Can you please help me with this issue? Am I missing any implementation details?