[WIP v1 - deprecated] entmax 1.5 for attention and outputs, faster implementation of sparsemax #1541

bpopeters · 2019-08-23T17:12:18Z

This pull request adds support for entmax 1.5, a sparse alternative to softmax that we describe in our ACL paper, Sparse Sequence-to-Sequence Models. It uses the implementations of sparsemax and entmax 1.5 from the entmax package, which is available from pip.
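To illustrate what "sparse alternative to softmax" means here, below is a minimal pure-Python sketch of sparsemax next to softmax. It is not the code in this pull request and not the entmax package's implementation (which operates on tensors); the function names and the sort-based thresholding are just one straightforward way to write it, under the assumption that the input is a non-empty list of floats. Entmax 1.5 sits between the two: it can also assign exactly zero probability, but less aggressively than sparsemax.

```python
import math


def softmax(scores):
    """Dense baseline: every entry receives strictly positive probability."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Illustrative sort-based sparsemax (Euclidean projection onto the simplex).

    Finds a threshold tau such that sum(max(s - tau, 0)) == 1 and
    returns the clipped values. Low-scoring entries get exactly 0.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        # The support condition holds for a prefix of the sorted scores;
        # the last k that satisfies it determines the threshold.
        if 1.0 + k * z > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1]
    print("softmax:  ", softmax(scores))
    print("sparsemax:", sparsemax(scores))
```

With these scores, softmax spreads probability over all three entries, while sparsemax puts all of it on the first one and sets the others to exactly zero.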

This pull request does not include support for entmax with other alpha values. I suspect the code for that will be a little more involved, but I can get to it soon.

It also does not include support for entmax attention in transformers, but I can probably make that PR next week as well.

One potential issue is that our entmax code does not support Python 2. I don't know who still needs Python 2 support for OpenNMT.

vince62s · 2019-08-24T08:46:33Z

Hi Ben, welcome back.
Up to now, we have tried to keep the code Python 2 compatible (which is a requirement in Travis, as you can see). I do understand it is a bit obsolete (and Python 3 is already a requirement for distributed training), but would it take much to make it compatible?

bpopeters · 2019-08-24T09:55:12Z

It probably would not require very many changes, but it isn't really on our agenda, since Python 2 is only supported until the end of the year.

bpopeters added 4 commits August 23, 2019 16:04

add entmax 1.5 attention

466a8cf

fix unclear variable names

9d18618

add entmax 1.5 loss

29c6499

update requirements.txt

a585a6b

vince62s changed the title ~~entmax 1.5 for attention and outputs, faster implementation of sparsemax~~ [WIP] entmax 1.5 for attention and outputs, faster implementation of sparsemax Sep 4, 2019

vince62s changed the title ~~[WIP] entmax 1.5 for attention and outputs, faster implementation of sparsemax~~ [WIP v1 - deprecated] entmax 1.5 for attention and outputs, faster implementation of sparsemax Jan 19, 2023
