
Summarize operators used in ConvS2S #7312

Closed
lcy-seso opened this issue Jan 8, 2018 · 0 comments
lcy-seso commented Jan 8, 2018

Here I summarize the operators that will be used in ConvS2S:

  1. positional embedding

  2. convolution block structure: one-dimensional convolution followed by a GLU.

    Whether GLU should be implemented as a single fused operator for better time efficiency can be determined later.

    • sequence convolution
      • 2D convolution: sequence_conv_op
    • GLU
      • offset operator ?? (To be determined later)
      • sigmoid
      • element-wise multiplication
      • addition
    • attention
    • weight normalization
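The gating step above can be sketched in NumPy. This is only an illustrative reference for the GLU computation (split the conv output in half along the channel axis, then gate one half with the sigmoid of the other), not a Fluid operator implementation:

```python
import numpy as np

def glu(x, axis=-1):
    # GLU(x) = a * sigmoid(b), where a and b are the two halves of x
    # split along `axis`. In ConvS2S this follows the 1-D convolution,
    # whose output channels are doubled so the split is well-defined.
    a, b = np.split(x, 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))
```

Whether this maps to one fused operator or a composition of the split/sigmoid/element-wise-multiply operators listed above is exactly the open question noted earlier.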

The missing operator:

  1. weight normalization, related to Implement weight normalization in Fluid. #6914
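For reference, the reparameterization that #6914 tracks can be sketched as follows (a minimal NumPy illustration of weight normalization, w = g * v / ||v||, per Salimans & Kingma; not the Fluid implementation):

```python
import numpy as np

def weight_norm(v, g):
    # Decouple the weight's direction (v / ||v||) from its
    # magnitude (the scalar g): w = g * v / ||v||.
    return g * v / np.linalg.norm(v)
```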

Needs to be enhanced (this enhancement is needed by both Transformer and ConvS2S):

  1. look_up_table : related to Support padding_idx in the lookup_table_op. #7309
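The desired padding_idx behavior can be illustrated with a small NumPy sketch (an assumption about the semantics requested in #7309, analogous to padding_idx in other frameworks: rows gathered for the padding id are zeroed, not trained embeddings):

```python
import numpy as np

def lookup_table(table, ids, padding_idx=None):
    # Gather one embedding row per id. Fancy indexing returns a copy,
    # so zeroing the padding positions does not modify `table`.
    out = table[ids]
    if padding_idx is not None:
        out[ids == padding_idx] = 0.0
    return out
```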
@lcy-seso lcy-seso added the NMT label Jan 8, 2018
@lcy-seso lcy-seso changed the title Operators used in ConvS2S Summarize operators used in ConvS2S Jan 8, 2018