DSEM grows as layers advance, but is significantly smaller for activation functions than for feature map outputs
Simpleconv with a 7x7 kernel is not learning MNIST r16
- With r8 it seems to learn slowly, and it learns normally for r2/r0
