I have been unable to reproduce the results shown in the paper. I have trained the model for 20k steps and the loss has fallen nicely throughout training. When I produce the output summaries, however (using mode=decode, which I presume is correct?), they are not good. An illustrative output summary is shown below. If I resume from that checkpoint and train the model further, it reports a NaN loss and stops training.
What am I missing here? The command I use to train the model is:
python run_summarization.py \
  --mode=train \
  --data_path=$DATA_DIR/train.bin \
  --vocab_path=$DATA_DIR/vocab \
  --log_root=logroot \
  --exp_name=exp \
  --max_dec_steps=210 \
  --max_enc_steps=2500 \
  --num_sections=5 \
  --max_section_len=500 \
  --batch_size=1 \
  --vocab_size=50000 \
  --use_do=True \
  --optimizer=adagrad \
  --do_prob=0.25 \
  --hier=True \
  --split_intro=True \
  --fixed_attn=True \
  --legacy_encoder=False \
  --coverage=False \
  --lr=0.05
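For completeness, this is roughly how I invoke decoding. It is a sketch rather than my exact command: it assumes a test.bin split sitting next to train.bin, and that the --single_pass flag from the upstream pointer-generator code this repo builds on is defined in run_summarization.py (please check the flag definitions there). The model flags are kept identical to training so the checkpoint loads:
python run_summarization.py \
  --mode=decode \
  --single_pass=True \
  --data_path=$DATA_DIR/test.bin \
  --vocab_path=$DATA_DIR/vocab \
  --log_root=logroot \
  --exp_name=exp \
  --max_dec_steps=210 \
  --max_enc_steps=2500 \
  --num_sections=5 \
  --max_section_len=500 \
  --vocab_size=50000 \
  --hier=True \
  --split_intro=True \
  --fixed_attn=True \
  --legacy_encoder=False \
  --coverage=False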
Illustrative example:
background : of of .
.
the the under of either of the private public private medicine medicine private private other has successfully investigated .
here we by case of this by in first chronic chronic of of the the private of the patients [UNK] 75 symptoms causing the .
it history .
this is method the first successful chronic mortality.19 without chronic chronic of .
, [ , the condition the percentage of adult .
without mortality.19 mortality.19 the with without without without without without of without without without without of other private private the other .
results results the would suggest and identifying private private private of malignancy improve increases .
we also demonstrated the susceptibility and new new elderly this report chronic chronic .
chronic of of with with asthma4 without asthma4 the significantly higher .
there , it it greater greater than .
I think at 20k steps the model is still undertrained.
I suggest starting with a smaller section length and fewer sections, then increasing them for the final steps: something like --max_section_len=400, --num_sections=4, --max_dec_steps=100, --max_enc_steps=1600 (see the sketch below).
I would start from scratch. I also remember seeing some NaN issues, although this was a while ago (as far as I recall, NaNs were more likely to occur on longer sequences).
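As a sketch, the staged run could look like this. The flags are copied from your command; exp_staged is just a hypothetical fresh experiment name, and this assumes the trainer resumes from the latest checkpoint under log_root/exp_name when relaunched, as the upstream pointer-generator code does. If the length flags turn out to affect variable shapes, the second phase's restore will fail and you would need to keep them fixed instead:
# Phase 1: train from scratch with shorter inputs and outputs
python run_summarization.py \
  --mode=train \
  --data_path=$DATA_DIR/train.bin \
  --vocab_path=$DATA_DIR/vocab \
  --log_root=logroot \
  --exp_name=exp_staged \
  --max_section_len=400 \
  --num_sections=4 \
  --max_dec_steps=100 \
  --max_enc_steps=1600 \
  --batch_size=1 \
  --vocab_size=50000 \
  --use_do=True \
  --do_prob=0.25 \
  --optimizer=adagrad \
  --hier=True \
  --split_intro=True \
  --fixed_attn=True \
  --legacy_encoder=False \
  --coverage=False \
  --lr=0.05

# Phase 2: once the loss flattens, stop and relaunch with the full lengths;
# the run resumes from the latest checkpoint in logroot/exp_staged
python run_summarization.py \
  --mode=train \
  --data_path=$DATA_DIR/train.bin \
  --vocab_path=$DATA_DIR/vocab \
  --log_root=logroot \
  --exp_name=exp_staged \
  --max_section_len=500 \
  --num_sections=5 \
  --max_dec_steps=210 \
  --max_enc_steps=2500 \
  --batch_size=1 \
  --vocab_size=50000 \
  --use_do=True \
  --do_prob=0.25 \
  --optimizer=adagrad \
  --hier=True \
  --split_intro=True \
  --fixed_attn=True \
  --legacy_encoder=False \
  --coverage=False \
  --lr=0.05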