Option to output "test predictions" text file with each checkpoint in run_seq2seq.py #10381

kingpalethe · 2021-02-24T18:32:26Z

Further to this discussion:

https://discuss.huggingface.co/t/how-to-output-test-generations-txt-with-run-seq2seq-py/3825

The prior incarnation of this script would output test generations at each checkpoint, which was very useful for understanding the progress of model training.

The current script...

https://github.com/huggingface/transformers/blob/master/examples/seq2seq/run_seq2seq.py

Seems to only output this text file once, at the end of the last epoch.

If there was a way to enable the previous behavior, I am guessing that would be widely useful.

thanks

LysandreJik · 2021-02-24T23:44:04Z

May be of interest to @patil-suraj @stas00 @sgugger

stas00 · 2021-02-24T23:58:05Z

Yes, as I replied in the forums, this functionality was dropped - not sure why it was done, as I wasn't part of the planning discussion.

I think it was not intentional. The devs were probably unaware it was being used, and given that the example tests were dropped too, it's not surprising it was missed. I propose that the dropped example tests be restored (which will require porting them to the new script), which will expose some of the functionality that was removed along with them.

Practically, let's identify what else might have been removed, create separate issues besides this one, and maybe ask the community to help restore/backport the previously working things to the new script(s)?

e.g. one such important thing is the tests that were moved to legacy, so this script is no longer being tested.

p.s. this should help with restoring/porting the example tests: #10036

stas00 · 2021-02-25T16:53:42Z

@bhadreshpsavani, please let us know if you're inspired to take care of this in:
#10337 (comment)
Thank you.

bhadreshpsavani · 2021-02-25T17:44:16Z

Sure @stas00,
I can take care of this with a separate PR or if possible in the same PR,
Thanks

stas00 · 2021-02-26T21:41:49Z

Correction, as I was refactoring run_seq2seq.py I can see now that the code wasn't removed - it's exactly the same. Someone decided to rename the resulting file instead. So the feature hasn't been removed, just renamed.

I'm not attached to either name:

the original script saved the file as "test_generations.txt"
the new one saves it as "test_preds_seq2seq.txt"

I think the original name is the most intuitive one.

@sgugger, do you have an opinion here?

stas00 · 2021-02-26T21:43:07Z

@bhadreshpsavani, so please hold a moment while we are re-modelling run_seq2seq.py and then I will update you when the model example is ready to be synced. Thank you!

stas00 · 2021-02-26T21:52:17Z

PR to restore the original functionality: #10428

stas00 · 2021-02-27T16:23:56Z

OK, the original name has been restored as it used to be, @kingpalethe

As I mentioned in #10428, if you'd like to request a new feature that does this at each checkpoint, please don't hesitate to open such a request.
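
For anyone who wants per-checkpoint output in the meantime, here is a minimal sketch of the idea, not the script's actual code. It assumes the `checkpoint-<step>` directory layout that the Trainer uses for saved checkpoints; `save_test_generations` and its parameters are hypothetical names made up for illustration, and producing the predictions themselves is left out.

```python
from pathlib import Path
from typing import Iterable


def save_test_generations(
    output_dir: str,
    step: int,
    predictions: Iterable[str],
    filename: str = "test_generations.txt",
) -> Path:
    """Write one prediction per line to <output_dir>/checkpoint-<step>/<filename>.

    Hypothetical helper: the function name and arguments are assumptions,
    not part of run_seq2seq.py.
    """
    checkpoint_dir = Path(output_dir) / f"checkpoint-{step}"
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    target = checkpoint_dir / filename
    # Strip leading/trailing whitespace and terminate each prediction with a newline.
    text = "".join(f"{pred.strip()}\n" for pred in predictions)
    target.write_text(text, encoding="utf-8")
    return target
```

Calling this with the decoded test predictions each time a checkpoint is saved would leave one `test_generations.txt` next to each checkpoint, instead of a single file at the end of training.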

kingpalethe · 2021-02-27T23:42:31Z

@stas00 thanks -- apologies, you are correct. I had hallucinated this behavior. I made a new issue: #10439

stas00 · 2021-02-27T23:46:54Z

All is good.

I now see that my PR made that script inconsistent with the other scripts, but perhaps all scripts should use the same filename, test_generations.txt. I can't quite see the point of it having a different name in each script.

stas00 mentioned this issue Feb 25, 2021

[trainer] port metrics logging and saving methods to all example scripts #10337

Closed

stas00 added a commit that referenced this issue Feb 26, 2021

[run_seq2seq.py] restore functionality: saving to test_generations.txt

23ebef7

This PR restores the original functionality that for some reason was modified. Fixes: #10381 @sgugger

stas00 mentioned this issue Feb 26, 2021

[run_seq2seq.py] restore functionality: saving to test_generations.txt #10428

Merged

stas00 closed this as completed in #10428 Feb 27, 2021

stas00 added a commit that referenced this issue Feb 27, 2021

[run_seq2seq.py] restore functionality: saving to test_generations.txt (#10428)

f52a158

This PR restores the original functionality that for some reason was modified. Fixes: #10381 @sgugger
