You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In Gigaword dataset there are some examples where the summary is longer than the source sequence. Sometimes the sourse is a single unk word. As I can see in dataclass.py, such examples are dropped from the pipeline completely.
Were the rouge scores reported in the paper computed without those examples? If yes, then it is incorrect to compare the resulting scores with the baselines. For example, as I can see, the rouge scores for Concept Pointer were taken directly from the paper, where they measured the performance on all test examples.
The text was updated successfully, but these errors were encountered:
In Gigaword dataset there are some examples where the summary is longer than the source sequence. Sometimes the sourse is a single
unk
word. As I can see indataclass.py
, such examples are dropped from the pipeline completely.Were the rouge scores reported in the paper computed without those examples? If yes, then it is incorrect to compare the resulting scores with the baselines. For example, as I can see, the rouge scores for Concept Pointer were taken directly from the paper, where they measured the performance on all test examples.
The text was updated successfully, but these errors were encountered: