
Refactor Seq2Vec #5388

Open
david-waterworth opened this issue Aug 29, 2021 · 0 comments

Is your feature request related to a problem? Please describe.
Arguably the best feature of AllenNLP is the ability to compose models from components, but I feel the Seq2Vec implementation isn't particularly composable. A Seq2Vec encoder is essentially a Seq2Seq contextualiser followed by a pooler, yet you cannot mix and match contextualisers and poolers. I feel it could be refactored somewhat.

In particular, some pooling operations are missing from Seq2Vec (max pooling, attention across encoder states), and some Seq2Seq encoders are missing (cnn-highway).

Describe the solution you'd like
Create an abstract Seq2Vec with two steps: a Seq2Seq contextualiser followed by a Seq2Vec pooler. Refactor the existing models so that all the encoder steps are implemented as Seq2Seq encoders and all the poolers are (encoder-less) Seq2Vecs. A sketch of what I mean is below.
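
To make this concrete, here is a minimal sketch of the kind of composed encoder I have in mind, built on the existing `Seq2SeqEncoder` and `Seq2VecEncoder` base classes. The `ComposedSeq2VecEncoder` name and constructor are hypothetical, not an existing AllenNLP API:

```python
import torch
from allennlp.modules.seq2seq_encoders import Seq2SeqEncoder
from allennlp.modules.seq2vec_encoders import Seq2VecEncoder


class ComposedSeq2VecEncoder(Seq2VecEncoder):
    """Hypothetical: a Seq2Seq contextualiser followed by an (encoder-less) Seq2Vec pooler."""

    def __init__(self, contextualiser: Seq2SeqEncoder, pooler: Seq2VecEncoder) -> None:
        super().__init__()
        self._contextualiser = contextualiser
        self._pooler = pooler

    def get_input_dim(self) -> int:
        return self._contextualiser.get_input_dim()

    def get_output_dim(self) -> int:
        return self._pooler.get_output_dim()

    def forward(self, tokens: torch.Tensor, mask: torch.BoolTensor = None) -> torch.Tensor:
        # (batch, seq_len, input_dim) -> (batch, seq_len, hidden_dim)
        contextualised = self._contextualiser(tokens, mask)
        # (batch, seq_len, hidden_dim) -> (batch, hidden_dim)
        return self._pooler(contextualised, mask)
```

Any registered contextualiser (LSTM, transformer, cnn-highway, ...) could then be paired with any pooler without writing a new encoder class for each combination.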

Describe alternatives you've considered
This proposal is along the lines of VAMPIRE (https://github.com/allenai/vampire), where they created an `Encoder` wrapper for this exact purpose.

Additional context
I find max pooling over the encoder states more effective than the last LSTM hidden state for my domain. I'd also like to experiment with attention over hidden states. Both of these required custom code despite being fairly common operations; the max pooling case is sketched below.
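
For the first case, masked max pooling over encoder states is only a few lines of PyTorch. This is a rough illustration of the operation (the `masked_max_pool` helper is hypothetical, not existing AllenNLP code):

```python
import torch


def masked_max_pool(states: torch.Tensor, mask: torch.BoolTensor) -> torch.Tensor:
    """Max-pool encoder states over the time dimension, ignoring padding.

    states: (batch, seq_len, dim); mask: (batch, seq_len), True for real tokens.
    """
    # Fill padded positions with the smallest representable value
    # so they can never win the max.
    fill_value = torch.finfo(states.dtype).min
    masked = states.masked_fill(~mask.unsqueeze(-1), fill_value)
    return masked.max(dim=1).values
```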
