Adjoint variables kept around longer than necessary #84

Open
dham opened this issue Jun 9, 2022 · 0 comments
Labels
enhancement New feature or request pyadjoint Issue related to pyadjoint core.

dham commented Jun 9, 2022

As a brief reminder of how the maths works, suppose we have the following sequence:
$$u_0 = m$$
$$u_1 = f_0(u_0)$$
$$u_2 = f_1(u_1)$$
$$u_3 = f_2(u_1)$$
$$J = f_3(u_2, u_3)$$
Then we get a block on the tape corresponding to each $f_0,\ldots,f_3$ and the adjoint calculation does the following:
$$u'_2, u'_3 = f'_3(u_2, u_3, J)$$
$$u'_1 = f'_2(u_1, u_3, u'_3)$$
$$u'_1 = u'_1 + f'_1(u_1, u_2, u'_2)$$
$$u'_0 = f'_0(u_0, u_1, u'_1)$$
$$m' = u'_0$$
where primes indicate adjoint operations and variables. Let's observe a few patterns:

  1. If $u$ is an input to $f$ then $f'$ adds to $u'$. Multiple adjoint operations can contribute to the same adjoint variable.
  2. $u'$ is an input to $f'$ if and only if the corresponding primal variable $u$ was an output of the primal operation $f$. Since each primal variable is the output of only one primal operation, each adjoint variable is an input to only one adjoint operation.
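As a concrete check of the recurrences and the two patterns above, here is a minimal scalar sketch. The particular choices of $f_0,\ldots,f_3$ (sine, square, exponential, product) and the names `forward` and `adjoint` are assumptions made purely for illustration; none of this is pyadjoint code.

```python
import math


def forward(m):
    """Primal sequence: u0 = m, u1 = f0(u0), u2 = f1(u1), u3 = f2(u1), J = f3(u2, u3)."""
    u0 = m
    u1 = math.sin(u0)   # f0 (illustrative choice)
    u2 = u1 ** 2        # f1 (illustrative choice)
    u3 = math.exp(u1)   # f2 (illustrative choice)
    J = u2 * u3         # f3 (illustrative choice)
    return u0, u1, u2, u3, J


def adjoint(m):
    """Reverse sweep in the same order as the adjoint equations above."""
    u0, u1, u2, u3, J = forward(m)
    adj_J = 1.0  # seed for dJ/dJ

    # f3': reads adj_J (adjoint of its output), contributes to adj_u2 and adj_u3.
    adj_u2 = u3 * adj_J
    adj_u3 = u2 * adj_J

    # f2': reads adj_u3 and makes the first contribution to adj_u1.
    adj_u1 = math.exp(u1) * adj_u3

    # f1': reads adj_u2 and adds a second contribution to adj_u1 (pattern 1).
    adj_u1 += 2.0 * u1 * adj_u2

    # f0': the only operation that reads adj_u1 (pattern 2).
    adj_u0 = math.cos(u0) * adj_u1
    return adj_u0
```

In this sketch `adj_u3` is read only by the adjoint of `f2`, `adj_u2` only by the adjoint of `f1`, and `adj_u1` only by the adjoint of `f0`, so each value is dead once the adjoint operation that reads it has run.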

This means that when we evaluate the adjoint, and provided we don't then intend to evaluate the Hessian, we can discard the adjoint values of a block's outputs as soon as that block's adjoint has been evaluated. Because we do need to keep the adjoint values around if we plan to evaluate the Hessian, this would need to be controlled by an option, e.g. "keep_adjoint_variables=True".

This would happen at the end of block.evaluate_adj and would do something like:

if not keep_adjoint_variables:
    for output in outputs:
        output.reset_variables("adjoint")

Note that this should happen even if there are no relevant dependencies. So, rather than returning early in that case, we should simply skip the preparation routine:

        if relevant_dependencies:
            prepared = self.prepare_evaluate_adj(inputs, adj_inputs, relevant_dependencies)
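
To make the proposed control flow concrete, here is a self-contained toy tape. It is a sketch under stated assumptions, not pyadjoint's implementation: `Var`, `Block`, the `relevant` flag and `run` are hypothetical stand-ins, only `keep_adjoint_variables` is taken from the proposal above, and the constant `c` is a contrived way of exercising the case where a block has no relevant dependencies.

```python
class Var:
    """Toy stand-in for a tape variable (not pyadjoint's BlockVariable)."""

    def __init__(self, value, relevant=True):
        self.value = value
        self.adj_value = None
        self.relevant = relevant  # hypothetical flag: do we want adjoint contributions?

    def add_adj(self, contribution):
        if self.adj_value is None:
            self.adj_value = contribution
        else:
            self.adj_value += contribution


class Block:
    """Toy block with one output and one partial derivative per input."""

    def __init__(self, inputs, output, partials):
        self.inputs, self.output, self.partials = inputs, output, partials

    def evaluate_adj(self, keep_adjoint_variables=True):
        adj_out = self.output.adj_value
        relevant = [(i, v) for i, v in enumerate(self.inputs) if v.relevant]
        # No early return: skip the propagation, but still reach the clean-up.
        if adj_out is not None and relevant:
            values = [v.value for v in self.inputs]
            for i, var in relevant:
                var.add_adj(self.partials[i](*values) * adj_out)
        if not keep_adjoint_variables:
            self.output.adj_value = None  # nobody else reads this adjoint value


def run(keep_adjoint_variables):
    m = Var(2.0)                   # control; not the output of any block
    c = Var(1.0, relevant=False)   # constant, so its block has no relevant dependencies
    u1 = Var(3.0 * m.value)
    u2 = Var(u1.value ** 2)
    u3 = Var(2.0 * u1.value)
    w = Var(5.0 * c.value)
    J = Var(u2.value * u3.value + w.value)
    tape = [
        Block([m], u1, [lambda x: 3.0]),
        Block([u1], u2, [lambda x: 2.0 * x]),
        Block([u1], u3, [lambda x: 2.0]),
        Block([c], w, [lambda x: 5.0]),
        Block([u2, u3, w], J,
              [lambda a, b, d: b, lambda a, b, d: a, lambda a, b, d: 1.0]),
    ]
    J.adj_value = 1.0  # seed
    for block in reversed(tape):
        block.evaluate_adj(keep_adjoint_variables)
    live = {"u1": u1.adj_value, "u2": u2.adj_value, "u3": u3.adj_value,
            "w": w.adj_value, "J": J.adj_value}
    return m.adj_value, live
```

Calling `run(True)` and `run(False)` gives the same gradient with respect to `m`. With `keep_adjoint_variables=False` every block output ends the sweep with `adj_value` set to `None`, including `w`, whose block never propagates anything. The control's adjoint survives in this toy because `m` is not the output of any block.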