Recommended approach for custom validation #1115
Comments
The easiest way, I suppose, which modifies no fairseq code, is to run a separate script over each of the validation checkpoints to calculate your SARI/FKGL, then select the best checkpoint that way. The second easiest way, which involves small modifications, is to insert this calculation at validation time, so that when the model prints validation metrics your metric is calculated as well, and you can grep the log files for your desired checkpoint. Otherwise, yes, you would need to modify the fairseq code to change to your custom metric if you want to override it.
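A minimal sketch of that first approach (a separate script over saved checkpoints). The data-bin path, the checkpoints/ directory, and the compute_sari helper are assumptions you would adapt to your setup; only the fairseq-generate flags (--path, --gen-subset, --beam) are standard.

```python
#!/usr/bin/env python
"""Sketch: pick the best checkpoint by an external metric (e.g. SARI)."""
import glob
import re
import subprocess


def generate(checkpoint, data_bin="data-bin", subset="valid"):
    """Run fairseq-generate for one checkpoint and return parallel lists of
    (sources, references, hypotheses) parsed from the S-/T-/H- output lines."""
    out = subprocess.run(
        ["fairseq-generate", data_bin,
         "--path", checkpoint,
         "--gen-subset", subset,
         "--beam", "5"],
        capture_output=True, text=True, check=True,
    ).stdout

    def grab(prefix):
        # Lines look like "S-12<tab>text", "T-12<tab>text", "H-12<tab>score<tab>text"
        return {int(m.group(1)): m.group(2).split("\t")[-1]
                for m in re.finditer(rf"^{prefix}-(\d+)\t(.*)$", out, re.M)}

    srcs, refs, hyps = grab("S"), grab("T"), grab("H")
    ids = sorted(hyps)
    return [srcs[i] for i in ids], [refs[i] for i in ids], [hyps[i] for i in ids]


def compute_sari(sources, hypotheses, references):
    """Placeholder: plug in a SARI implementation of your choice here."""
    raise NotImplementedError


best = None
for ckpt in sorted(glob.glob("checkpoints/checkpoint*.pt")):
    sources, references, hypotheses = generate(ckpt)
    sari = compute_sari(sources, hypotheses, references)
    print(f"{ckpt}\tSARI={sari:.2f}")
    if best is None or sari > best[1]:
        best = (ckpt, sari)

print("Best checkpoint by SARI:", best)
```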
Thanks for the suggestions @huihuifan! The solution of using a separate script is what I've been using. It works, but I was curious whether there was a more 'elegant' way of implementing it in the code. I guess directly hacking the code is the only way to go.
I think you'll need to make the slightly larger changes to be more elegant; the hardcoding of validation loss and other metrics will need to be changed.
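To make the "slightly larger change" concrete, here is a rough sketch (not an official recipe, and hook names can differ between fairseq versions) of a custom task that logs a SARI value during validation; once the value appears in the validation stats it can be selected with --best-checkpoint-metric sari --maximize-best-checkpoint-metric. The _generate_for_sample helper and compute_sari are assumptions you would have to implement yourself.

```python
# Rough sketch only: assumes you implement _generate_for_sample() and
# compute_sari() yourself, and that your fairseq version exposes
# fairseq.metrics (newer versions use fairseq.logging.metrics).
from fairseq import metrics
from fairseq.tasks import register_task
from fairseq.tasks.translation import TranslationTask


def compute_sari(sources, hypotheses, references):
    """Placeholder for a SARI implementation of your choice."""
    raise NotImplementedError


@register_task("translation_with_sari")
class TranslationWithSariTask(TranslationTask):

    def valid_step(self, sample, model, criterion):
        loss, sample_size, logging_output = super().valid_step(sample, model, criterion)
        # Hypothetical helper: decode this validation batch to get
        # (sources, hypotheses, references) as lists of strings.
        srcs, hyps, refs = self._generate_for_sample(model, sample)
        logging_output["sari"] = compute_sari(srcs, hyps, refs) * sample_size
        return loss, sample_size, logging_output

    def reduce_metrics(self, logging_outputs, criterion):
        super().reduce_metrics(logging_outputs, criterion)
        sari_sum = sum(log.get("sari", 0) for log in logging_outputs)
        sample_size = sum(log.get("sample_size", 0) for log in logging_outputs)
        # Anything logged here shows up in the validation stats, which is what
        # --best-checkpoint-metric reads when deciding checkpoint_best.pt.
        metrics.log_scalar("sari", sari_sum / max(sample_size, 1), sample_size, round=2)
```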
Hello @feralvam, I have the same problem as you. Could you share more about "the solution of using a separate script" that you've been using, for the case where the metric is SARI instead of BLEU?
Hi,
Currently, the "best" model at validation time is chosen according to the value of the "loss". However, I would like it to be chosen using another metric (a custom one that could behave like BLEU, for example). What would be the best way to implement this?
I noticed there is the best-checkpoint-metric parameter for fairseq-train, but I am unsure about where this new function should be implemented so that it can be used by the trainer. The only example I could find is for fine-tuning RoBERTa on GLUE using accuracy. But there, accuracy is hardcoded in train.py (as far as I could tell). In addition, save_checkpoint in checkpoint_utils.py has "val_loss" hardcoded, too. Does this mean that I would need to change the code of core modules to incorporate the new validation metric?
Thanks for any guidance you could provide.