
Force DenseWithSparseWeights to have at least one entry per row #7999

Closed
JEM-Mosig opened this issue Feb 19, 2021 · 3 comments · Fixed by #8011
Assignees
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@JEM-Mosig
Contributor

Description of Problem:

Whenever DenseWithSparseWeights ends up with rows whose weights are all zero (which can happen when sparsity is high), the layer's effective size is reduced, which defeats the purpose of specifying that size. We should prevent this from happening.

Overview of the Solution:

Examples (if relevant):
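A minimal sketch of the failure mode, using NumPy as a stand-in for the Bernoulli mask that a sparse dense layer applies to its kernel (the helper name `random_sparse_mask` is hypothetical, not Rasa's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_sparse_mask(units_in, units_out, density):
    # Hypothetical stand-in: keep each weight with probability `density`,
    # as a sparsified dense kernel would.
    return (rng.random((units_in, units_out)) < density).astype(np.float32)

mask = random_sparse_mask(64, 64, density=0.05)
zero_rows = int((mask.sum(axis=1) == 0).sum())
# Each row is all-zero with probability 0.95**64 (about 3.7%), so at high
# sparsity a few inputs are typically disconnected entirely.
print(zero_rows, "of 64 input rows have no connection at all")
```

At even higher sparsity the chance of dead rows grows quickly, silently shrinking the layer below its configured size.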

Blockers (if relevant):

Definition of Done:

@JEM-Mosig JEM-Mosig added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Feb 19, 2021
@JEM-Mosig JEM-Mosig self-assigned this Feb 19, 2021
@JEM-Mosig
Contributor Author

It might be even better to use LocallyConnected1D layers. I'm treating this as part of the same issue.

@JEM-Mosig
Contributor Author

JEM-Mosig commented Mar 17, 2021

LocallyConnectedDense layers are slow and buggy in the current TensorFlow version (`implementation != 1` doesn't work), so I backtracked on those changes. We're now using our own RandomlyConnectedDense layer.

I'm also forcing every input to be connected to at least one output, because it doesn't make sense to drop inputs at random.
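One way to get that guarantee is to overlay a wrapping "diagonal" on top of the random mask, so every row and every column receives at least one connection. This is only a sketch in the spirit of the fix, not the actual RandomlyConnectedDense code:

```python
import numpy as np

rng = np.random.default_rng(42)

def connected_sparse_mask(units_in, units_out, density):
    # Hypothetical sketch: random Bernoulli mask, then a wrapping diagonal
    # that touches every row index and every column index at least once.
    mask = (rng.random((units_in, units_out)) < density).astype(np.float32)
    for i in range(max(units_in, units_out)):
        mask[i % units_in, i % units_out] = 1.0
    return mask

m = connected_sparse_mask(64, 32, density=0.05)
assert (m.sum(axis=1) > 0).all()  # no input is dropped
assert (m.sum(axis=0) > 0).all()  # no output is dead
```

Even at density 0 the diagonal leaves `max(units_in, units_out)` connections, so the layer never silently loses rows or columns.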

@JEM-Mosig
Contributor Author

Changes to the sparse layers definitely stabilize TED's performance at low densities:

[image: performance curves at different densities, modified sparse layers vs. main branch]

Solid lines show performance at fixed density (see legend) with my modified sparse layers. Dash-dotted lines show the equivalent on the main branch.

The solid green curve corresponds to fully dense layers. The orange curves correspond to 20% density (i.e. 80% sparsity), which is our default; at that density my changes make no difference.

The dash-dotted red curve extends further to the left because, on main, sparse layers are allowed to drop inputs or outputs and thus can have fewer trainable weights.
