
Incorrect handling of scale in Loss.grad #468

Closed · bwohlberg opened this issue Nov 9, 2023 · 1 comment · Fixed by #470
Labels: bug (Something isn't working)

bwohlberg commented Nov 9, 2023

There is a bug in Loss.grad's handling of the scale attribute, but only when the scale is set via scalar multiplication:

import jax
from scico.loss import SquaredL2Loss
from scico.functional import L2Norm
import scico.numpy as snp

f = SquaredL2Loss(y=snp.zeros((4,)))
g = SquaredL2Loss(y=snp.zeros((4,)), scale=5)
h = 10 * f

# __call__ is correct
f(snp.ones((4,)))
>> Array(2., dtype=float32)
g(snp.ones((4,)))
>> Array(20., dtype=float32)
h(snp.ones((4,)))
>> Array(20., dtype=float32)

# grad is broken
f.grad(snp.ones((4,)))
>> Array([1., 1., 1., 1.], dtype=float32)
g.grad(snp.ones((4,)))
>> Array([10., 10., 10., 10.], dtype=float32)
h.grad(snp.ones((4,)))
>> Array([1., 1., 1., 1.], dtype=float32)
# expected to match g.grad, i.e. Array([10., 10., 10., 10.], dtype=float32)

The same bug is not present in Functional.grad:

f = L2Norm()
g = 10 * f

f.grad(snp.ones((4,)))
>> Array([0.5, 0.5, 0.5, 0.5], dtype=float32)
g.grad(snp.ones((4,)))
>> Array([5., 5., 5., 5.], dtype=float32)
bwohlberg (Collaborator, Author) commented:

The bug turns out to be due to a combination of this

def __init__(self):
    self._grad = scico.grad(self.__call__)

and this

scico/scico/loss.py, lines 126 to 129 at 5ffd1f9:

def __mul__(self, other):
    new_loss = copy(self)
    new_loss.set_scale(self.scale * other)
    return new_loss

The copy call does not trigger __init__, so the new Loss object ends up with _grad still set to the function that was constructed when __init__ ran for the original, unscaled Loss object.
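
The mechanism can be reproduced with a small self-contained example (a sketch for illustration only; ToyLoss and its attributes are hypothetical stand-ins, not SCICO code):

import copy

import jax
import jax.numpy as jnp


class ToyLoss:
    # Minimal stand-in for Loss (hypothetical) reproducing the mechanism above.
    def __init__(self, scale=0.5):
        self.scale = scale
        # _grad is built from *this* object's __call__ at construction time.
        self._grad = jax.grad(self.__call__)

    def __call__(self, x):
        return self.scale * jnp.sum(x**2)

    def __mul__(self, other):
        # Mirrors Loss.__mul__: copy does not re-run __init__, so the copy's
        # _grad still differentiates the original, unscaled __call__.
        new = copy.copy(self)
        new.scale = self.scale * other
        return new

    __rmul__ = __mul__  # so that `10 * f` works, as in the example above

    def grad(self, x):
        return self._grad(x)


f = ToyLoss()
h = 10 * f
print(h(jnp.ones(4)))       # 20.0 -- scale is applied by __call__
print(h.grad(jnp.ones(4)))  # [1. 1. 1. 1.] -- stale _grad ignores the new scale

Any shallow copy that changes scale without rebuilding _grad will exhibit the same mismatch between __call__ and grad.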

PR #470 has a simple fix, but this issue raises a few broader design questions:

  • Is there any value in initializing a _grad attribute on Functional objects, rather than simply having their grad method compute the gradient directly from __call__ (see the sketch after this list)?
  • Would the Loss implementation be at least slightly simpler if it were derived from ScaledFunctional rather than Functional?
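
For the first question, a lazy definition might look roughly like the following (a hedged sketch only, using jax.grad in place of scico.grad; LazyGradLoss is hypothetical and this is not necessarily what PR #470 does):

import copy

import jax
import jax.numpy as jnp


class LazyGradLoss:
    # Hypothetical sketch: no cached _grad attribute; the gradient is derived
    # from the current __call__ whenever grad is invoked.
    def __init__(self, scale=0.5):
        self.scale = scale

    def __call__(self, x):
        return self.scale * jnp.sum(x**2)

    def __mul__(self, other):
        new = copy.copy(self)
        new.scale = self.scale * other
        return new

    __rmul__ = __mul__

    def grad(self, x):
        # Differentiate whatever __call__ (and scale) this object has *now*.
        return jax.grad(self.__call__)(x)


f = LazyGradLoss()
h = 10 * f
print(h.grad(jnp.ones(4)))  # [10. 10. 10. 10.] -- scaling propagates to the gradient

Deriving the gradient from the current __call__ removes the possibility of a stale cache, at the cost of rebuilding the grad transform on each call.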

bwohlberg added a commit that referenced this issue Nov 15, 2023
* Update change log

* Resolve #468 and add corresponding test

* Shorten comment

* Resolve some oversights in prox definitions

* Minor edit

* Avoid chaining of ScaledFunctional and some code re-organization

* Address review comment