Thanks a lot for the great work.
I still have a few questions on implementation details.

First, what is the reason for partitioning the training procedure into powers of 2?
Second, I am confused about the normalization. For the source dataset you use maximum-absolute-value normalization, while for the sample dataset you use the scaling `lambda x: 10 * 1.0 / x.pow(2).sum().sqrt()`.
Can you give more insight into this choice?

Third, why did you choose to pad your sequences with small noise rather than zeros?
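To make the comparison concrete, here is a small PyTorch sketch of the two scalings as I understand them (the example tensor values are made up for illustration):

```python
import torch

x = torch.tensor([3.0, -4.0, 12.0])

# Maximum-absolute-value normalization (source dataset):
# divides by max|x_i|, so values land in [-1, 1].
x_maxabs = x / x.abs().max()

# L2-norm scaling (sample dataset):
# 10 * 1.0 / x.pow(2).sum().sqrt() rescales x to have L2 norm exactly 10.
scale = 10 * 1.0 / x.pow(2).sum().sqrt()
x_l2 = x * scale
```

So the first scheme bounds the range of the values, while the second fixes the overall energy (L2 norm) of each sample, which is what I would like to understand the rationale for.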