
Class CorefTaggerReview means what? #4

Open
smallsmallwood opened this issue Dec 10, 2018 · 3 comments

Comments

@smallsmallwood

I find that the output of class CorefTagger is a 12-dimensional vector, while the final output y in your paper is a 13-dimensional vector. Why the difference?

Another question: did you test on the ".auto_conll" files in your paper (COLING 2018)?

@ylmeng
Collaborator

ylmeng commented Dec 10, 2018

Sorry for the confusion.
For a triad (a, b, c), we only use the outputs for (a, c) and (b, c) in the current version. So a triad has three pairwise outputs, but we use only two of them for the final predictions. This is more efficient and often more accurate. However, nothing changes inside the neural network compared to the original version: all three pairs still go through the layers. If you use all three pairs, the scores should be very similar, perhaps slightly lower.
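The selection step described above can be sketched as follows (a minimal illustration with hypothetical names, not the authors' actual code): the network scores all three pairs of a triad (a, b, c), but only the two pairs involving c feed the final prediction.

```python
def select_pair_scores(triad_scores):
    """triad_scores maps each pair of a triad (a, b, c) to a coreference score."""
    # All three pairs went through the network layers...
    assert set(triad_scores) == {("a", "b"), ("a", "c"), ("b", "c")}
    # ...but only the pairs involving c are used for the final prediction.
    return {pair: s for pair, s in triad_scores.items() if "c" in pair}

scores = {("a", "b"): 0.2, ("a", "c"): 0.9, ("b", "c"): 0.7}
print(select_pair_scores(scores))  # {('a', 'c'): 0.9, ('b', 'c'): 0.7}
```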

The test set does not have gold_conll files, so we used auto_conll only. We had a bug in the evaluation program for our COLING paper, so those scores are not as good as the current ones. Specifically, separate parts of an article have no coreference between them, but we had assumed coreference could occur across parts, which made the task more difficult.
After we fixed the bug the scores improved, as you can see in the arXiv paper. Please refer to the arXiv version, which corrects some errors. (We tried to update the COLING paper too, but that process takes longer.)
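The fix described above amounts to restricting candidate mention pairs to the same document part. A minimal sketch, assuming a hypothetical data layout (this is not the actual evaluation code):

```python
def candidate_pairs(mentions):
    """mentions: list of (mention_id, part_id). Returns within-part pairs only."""
    pairs = []
    for i, (m1, p1) in enumerate(mentions):
        for m2, p2 in mentions[i + 1:]:
            if p1 == p2:  # coreference never crosses part boundaries
                pairs.append((m1, m2))
    return pairs

mentions = [("m0", 0), ("m1", 0), ("m2", 1)]
print(candidate_pairs(mentions))  # [('m0', 'm1')]
```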

@smallsmallwood
Author

Thanks for your patience.
I also don't understand the role of the torch.max() operation in the following code (I didn't find any analysis of it in your new paper):

word_repr_0, _ = self.Attention(word_lstm_0, torch.cat([word_lstm_1, word_lstm_2], 1))
word_repr_0, _ = torch.max(word_repr_0, dim=1, keepdim=False) # (batch, feature)

word_repr_1, _ = self.Attention(word_lstm_1, torch.cat([word_lstm_0, word_lstm_2], 1))
word_repr_1, _ = torch.max(word_repr_1, dim=1, keepdim=False) # (batch, feature)

word_repr_2, _ = self.Attention(word_lstm_2, torch.cat([word_lstm_0, word_lstm_1], 1))
word_repr_2, _ = torch.max(word_repr_2, dim=1, keepdim=False) # (batch, feature)

(attached screenshot: "gold-60")
This score is lower than the results tested with gold mentions in your new paper.

@ylmeng
Collaborator

ylmeng commented Jan 24, 2019

Sorry for the delay. torch.max() just performs max-pooling over time steps, which is widely used with RNN-based models.
So instead of taking the output of the last time step, or the average over time steps, we take the element-wise maximum over time steps to represent the sequence.
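The three pooling strategies mentioned above can be compared on a small example. This is a standalone numpy sketch of the idea (the repository itself applies torch.max along dim=1 of a (batch, time, feature) tensor):

```python
import numpy as np

x = np.array([[[1.0, 4.0],
               [3.0, 2.0],
               [2.0, 5.0]]])  # shape (batch=1, time=3, feature=2)

last = x[:, -1, :]     # last time step     -> [[2., 5.]]
mean = x.mean(axis=1)  # average over time  -> [[2., 3.667]]
maxp = x.max(axis=1)   # max-pool over time -> [[3., 5.]]
# torch.max(x, dim=1) returns a (values, indices) pair;
# its values component equals maxp here.
```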
