As a brief reminder of how the maths works, suppose we have the following sequence:
$$u_0 = m$$
$$u_1 = f_0(u_0)$$
$$u_2 = f_1(u_1)$$
$$u_3 = f_2(u_1)$$
$$J = f_3(u_2, u_3)$$
Then we get a block on the tape corresponding to each $f_0,\ldots,f_3$ and the adjoint calculation does the following:
$$u'_2, u'_3 = f'_3(u_2, u_3, J)$$
$$u'_1 = f'_2(u_1, u_3, u'_3)$$
$$u'_1 = u'_1 + f'_1(u_1, u_2, u'_2)$$
$$u'_0 = f'_0(u_0, u_1, u'_1)$$
$$m' = u'_0$$
where primes indicate adjoint operations and variables. Let's observe a few patterns:
- If $u$ is an input to $f$ then $f'$ adds to $u'$. Multiple adjoint operations can contribute to the same adjoint variable.
- $u'$ is an input to $f'$ if and only if the corresponding primal variable $u$ was an output of the primal operation $f$. Consequently, each adjoint variable is the input to only one adjoint operation.
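To make the two patterns concrete, here is a minimal sketch of the sequence above in plain Python. The specific functions chosen for $f_0,\ldots,f_3$ are illustrative assumptions, and this is not how the tape itself is implemented. Each adjoint value is removed with `pop()` when it is read, which only works because every adjoint variable is consumed by exactly one adjoint operation.

```python
import math

# Illustrative choices for the primal operations (assumptions, not from the issue).
def f0(u0): return 2.0 * u0
def f1(u1): return u1 ** 2
def f2(u1): return math.sin(u1)
def f3(u2, u3): return u2 * u3

def forward(m):
    u0 = m
    u1 = f0(u0)
    u2 = f1(u1)
    u3 = f2(u1)
    return u0, u1, u2, u3, f3(u2, u3)

def adjoint(m):
    u0, u1, u2, u3, J = forward(m)
    adj = {"u0": 0.0, "u1": 0.0, "u2": 0.0, "u3": 0.0}
    adj_J = 1.0  # seed for dJ/dJ
    # f3': adds to the adjoints of its inputs u2 and u3.
    adj["u2"] += u3 * adj_J
    adj["u3"] += u2 * adj_J
    # f2': reads (and discards) the adjoint of its output u3, adds to u1'.
    adj["u1"] += math.cos(u1) * adj.pop("u3")
    # f1': reads (and discards) the adjoint of its output u2, adds to u1'.
    adj["u1"] += 2.0 * u1 * adj.pop("u2")
    # f0': reads (and discards) the adjoint of its output u1, adds to u0'.
    adj["u0"] += 2.0 * adj.pop("u1")
    return adj  # only the adjoint of the control, m' = u0', is left

remaining = adjoint(0.7)
```

After the sweep, `remaining` holds a single entry, the gradient with respect to $m$. A second `pop()` of any intermediate adjoint would raise a `KeyError`, which is the property the proposal relies on.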
This means that when we are evaluating the adjoint, and assuming we don't then intend to evaluate the Hessian, we can discard the adjoint values of a block's outputs as soon as that block's adjoint has been evaluated. Because we do need to keep the adjoint values around if we plan to evaluate the Hessian, this would need to be controlled by an option, e.g. `keep_adjoint_variables=True`.
This would happen at the end of `block.evaluate_adj` and would do something like:

    if not keep_adjoint_variables:
        for output in outputs:
            output.reset_variables("adjoint")

Note that this clean-up should happen even if there are no relevant dependencies. So rather than returning early in that case, we should simply skip the call to the preparation routine:

    if relevant_dependencies:
        prepared = self.prepare_evaluate_adj(inputs, adj_inputs, relevant_dependencies)
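
The proposed control flow could look roughly like the toy below. `Variable`, `ScaleBlock`, `add_adj`, `marked_in_path` and `run_adjoint` are hypothetical stand-ins written for this sketch, not the library's actual classes. The point is only that the adjoint work is skipped when there are no relevant dependencies, while the clean-up of the output adjoints always runs.

```python
class Variable:
    """Hypothetical stand-in for a tape variable (primal value plus adjoint value)."""
    def __init__(self, value, marked_in_path=True):
        self.value = value
        self.marked_in_path = marked_in_path
        self.adj_value = None

    def add_adj(self, contribution):
        # Several adjoint operations may contribute to the same adjoint variable.
        self.adj_value = contribution if self.adj_value is None else self.adj_value + contribution

    def reset_variables(self, kind):
        if kind == "adjoint":
            self.adj_value = None


class ScaleBlock:
    """Toy block computing output = factor * input."""
    def __init__(self, factor, inp, out):
        self.factor, self.inputs, self.outputs = factor, [inp], [out]
        out.value = factor * inp.value

    def evaluate_adj(self, keep_adjoint_variables=False):
        adj_inputs = [out.adj_value for out in self.outputs]
        relevant_dependencies = [dep for dep in self.inputs if dep.marked_in_path]
        # Skip the work, but do not return early, when nothing is relevant.
        if relevant_dependencies and adj_inputs[0] is not None:
            for dep in relevant_dependencies:
                dep.add_adj(self.factor * adj_inputs[0])
        # Clean-up always runs: the output adjoints are not needed again.
        if not keep_adjoint_variables:
            for output in self.outputs:
                output.reset_variables("adjoint")


def run_adjoint(m_value, keep_adjoint_variables=False, mark_m=True):
    m = Variable(m_value, marked_in_path=mark_m)
    u1, J = Variable(None), Variable(None)
    tape = [ScaleBlock(2.0, m, u1), ScaleBlock(3.0, u1, J)]
    J.adj_value = 1.0  # seed the adjoint of the functional
    for block in reversed(tape):
        block.evaluate_adj(keep_adjoint_variables)
    return m.adj_value, u1.adj_value, J.adj_value
```

With the default setting only the adjoint of the control survives the sweep. With `keep_adjoint_variables=True` the intermediate adjoints stay available, as a later Hessian evaluation would require. With `mark_m=False` the first block has no relevant dependencies, yet the adjoint of its output is still released.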