grad update issue #12

Open
neil-yc opened this issue May 11, 2021 · 1 comment
neil-yc commented May 11, 2021

proj_grads_flatten = tf.vectorized_map(proj_grad, grads_task)

In the first iteration grad_i is updated to proj(grad_i), but it is not updated in the grads_task list, so in the next iteration, when we compute proj(grad_j), the original grad_i is taken into account rather than proj(grad_i). Is this reasonable?

51616 commented Oct 26, 2021

This code seems to align with the pseudocode in the paper (it does not change the gradients in-place). I'm not sure why they opted not to change the gradients in-place, though. Maybe because, if they did, the last gradient in the outer loop would not be projected at all, since all the other gradients would already have been aligned with it. This could have the side effect that the model trains more slowly (the gradients don't go directly where they are supposed to), but they state in the paper that this does not seem to happen in practice.
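For anyone following along, here is a minimal NumPy sketch of the two variants being discussed (this is not the repo's actual TF code, and the function names are made up for illustration): projecting each gradient against the original task gradients, as the paper's pseudocode does, versus updating the list in place so that later projections see already-projected gradients.

import numpy as np

def project_away_conflict(g_i, g_j):
    # If g_i conflicts with g_j (negative dot product), remove the
    # component of g_i along g_j; otherwise return g_i unchanged.
    dot = np.dot(g_i, g_j)
    if dot < 0:
        g_i = g_i - dot / (np.dot(g_j, g_j) + 1e-12) * g_j
    return g_i

def pcgrad_original(grads_task):
    # Variant matching the paper's pseudocode (and, apparently, this code):
    # every gradient is projected against the ORIGINAL task gradients.
    projected = []
    for i, g_i in enumerate(grads_task):
        g = g_i.copy()
        for j, g_j in enumerate(grads_task):
            if i != j:
                g = project_away_conflict(g, g_j)  # g_j is the original gradient
        projected.append(g)
    return projected

def pcgrad_inplace(grads_task):
    # Variant neil-yc is asking about: each projection overwrites the
    # gradient in the list, so projecting grad_j uses proj(grad_i)
    # rather than the original grad_i.
    grads = [g.copy() for g in grads_task]
    for i in range(len(grads)):
        for j in range(len(grads)):
            if i != j:
                grads[i] = project_away_conflict(grads[i], grads[j])
    return grads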
