Is your feature request related to a problem? Please describe.
Arguably the best feature of AllenNLP is the ability to compose models from components, but I feel the Seq2Vec implementations aren't very composable. A Seq2Vec encoder is essentially a Seq2Seq contextualiser followed by a pooler, yet you cannot mix and match contextualisers and poolers. I feel it could be refactored somewhat.
In particular, some pooling operations are missing from Seq2Vec (max pooling, attention across encoder states), and some Seq2Seq encoders are missing (cnn-highway).
Describe the solution you'd like
Create an abstract Seq2Vec with two steps: a Seq2Seq contextualiser followed by a Seq2Vec pooler. Refactor the existing models so that all the encoder steps are implemented as Seq2Seq encoders and all the poolers are (encoderless) Seq2Vec poolers.
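As a rough illustration, the composed Seq2Vec could look something like the sketch below, assuming the existing Seq2SeqEncoder and Seq2VecEncoder base classes; the class name and registration key are placeholders, not an actual API.

```python
import torch
from allennlp.modules import Seq2SeqEncoder, Seq2VecEncoder


@Seq2VecEncoder.register("composed")  # hypothetical registration key
class ComposedSeq2VecEncoder(Seq2VecEncoder):
    """Sketch: a Seq2Seq contextualiser followed by an (encoderless) Seq2Vec pooler."""

    def __init__(self, contextualiser: Seq2SeqEncoder, pooler: Seq2VecEncoder) -> None:
        super().__init__()
        self._contextualiser = contextualiser
        self._pooler = pooler

    def get_input_dim(self) -> int:
        return self._contextualiser.get_input_dim()

    def get_output_dim(self) -> int:
        return self._pooler.get_output_dim()

    def forward(self, tokens: torch.Tensor, mask: torch.Tensor = None) -> torch.Tensor:
        # (batch, seq_len, input_dim) -> (batch, seq_len, hidden_dim)
        contextualised = self._contextualiser(tokens, mask)
        # (batch, seq_len, hidden_dim) -> (batch, output_dim)
        return self._pooler(contextualised, mask)
```

Any existing Seq2Seq encoder (LSTM, cnn-highway, transformer) could then be paired with any pooler (last state, max, attention) from config, without writing a new encoder class for each combination.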
Describe alternatives you've considered
This proposal is along the lines of VAMPIRE (https://github.com/allenai/vampire), where they created an Encoder wrapper for exactly this purpose.
Additional context
I find max pooling over the encoder states more effective than the last LSTM hidden state for my domain. I'd also like to experiment with attention over hidden states. Both of these required custom code despite being fairly common operations.
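For example, masked max pooling over encoder states is only a handful of lines of PyTorch, but there is currently no off-the-shelf Seq2Vec pooler for it. A minimal sketch (the function name is mine):

```python
import torch


def masked_max_pool(encoder_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Max-pool over the time dimension, ignoring padded positions.

    encoder_states: (batch, seq_len, dim); mask: (batch, seq_len), nonzero for real tokens.
    """
    # Fill padded positions with the smallest representable value so they never win the max.
    fill_value = torch.finfo(encoder_states.dtype).min
    masked_states = encoder_states.masked_fill(~mask.bool().unsqueeze(-1), fill_value)
    return masked_states.max(dim=1).values
```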