In the first iteration, grad_i is updated to proj(grad_i), but it is not updated in the grads_task list. So in the next iteration, when we compute proj(grad_j), the original grad_i is taken into account rather than proj(grad_i). Is that reasonable?
This code seems to align with the pseudocode in the paper (which does not change the gradients in place). I'm not sure why they opted not to project in place, though. Maybe because the last gradient in the outer loop would otherwise not be projected at all, since all the other gradients would already have been aligned with it. This could have the side effect of slower training (the gradients don't point exactly the way they're supposed to), but they state in the paper that this doesn't seem to happen in practice.
PCGrad/PCGrad_tf.py
Line 51 in c5fbd7c
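To make the two behaviors concrete, here is a minimal sketch of the PCGrad projection step in plain Python (not the repo's actual TensorFlow code; `pcgrad` and `project_away` are hypothetical helper names). The `project_in_place` flag distinguishes the paper's variant, where every projection is computed against the original task gradients, from the variant asked about above, where later projections see already-projected gradients.

```python
def project_away(g, ref):
    """If g conflicts with ref (negative dot product), remove from g
    its component along ref; otherwise return g unchanged."""
    dot = sum(a * b for a, b in zip(g, ref))
    if dot >= 0:
        return g
    scale = dot / sum(r * r for r in ref)
    return [a - scale * r for a, r in zip(g, ref)]

def pcgrad(grads, project_in_place=False):
    """Sketch of the PCGrad projection loop (hypothetical helper).

    project_in_place=False follows the paper's pseudocode (and this
    repo): each grad_i is projected against the *original* gradients
    of the other tasks. project_in_place=True is the alternative
    raised in this issue: later projections use the already-projected
    gradients instead.
    """
    grads = [list(g) for g in grads]
    projected = [list(g) for g in grads]
    for i in range(len(projected)):
        # Reference set: original grads (paper) vs. current projected grads.
        refs = projected if project_in_place else grads
        for j in range(len(grads)):
            if i != j:
                projected[i] = project_away(projected[i], refs[j])
    return projected
```

For example, with two conflicting gradients `[1, 0]` and `[-1, 1]`, the paper's variant projects each against the other's original direction, yielding `[0.5, 0.5]` and `[0, 1]`; the in-place variant can give a different result because the second projection sees the already-modified first gradient.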