The model get_gradients code was changed to accept optional examples, for cases where the raw examples are needed to calculate gradients (such as BERT models). None of our current demos use the optional examples, so the bug described below has no effect on existing demo uses.
However, the examples provided are not the ones that correspond to the activations provided, so anyone who used the examples in get_gradients would get incorrect calculations.
The root cause is that the activations are generated from one shuffled set of concept examples, and then a different shuffled set of concept examples is loaded for passing to get_gradients (since get_examples_for_concept shuffles by default). This happens because the initial set used to calculate the activations isn't currently saved anywhere.
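To make the misalignment concrete, here is a minimal toy sketch, not the actual TCAV code. The names load_concept_examples and toy_activation are hypothetical stand-ins: the first only mimics a loader that shuffles by default, the second stands in for a bottleneck activation. The sketch shows that a second shuffled load no longer lines up with activations computed from the first load, and that keeping the examples next to their activations is one possible way to avoid that.

```python
import random


def load_concept_examples(pool, max_examples, shuffle=True, rng=random):
    """Hypothetical stand-in for a loader that shuffles by default."""
    examples = list(pool)
    if shuffle:
        rng.shuffle(examples)
    return examples[:max_examples]


def toy_activation(example):
    """Toy 'activation': a deterministic function of the example."""
    return example * 10


rng = random.Random(0)
pool = list(range(100))

# Step 1: activations are computed from one shuffled draw,
# and the draw itself is not kept.
first_draw = load_concept_examples(pool, 5, rng=rng)
activations = [toy_activation(e) for e in first_draw]

# Step 2: a second, independent shuffled draw is loaded and
# passed along as the "examples" for those activations.
second_draw = load_concept_examples(pool, 5, rng=rng)

# Count positions where the example does not match its activation.
mismatched = sum(
    toy_activation(e) != a for e, a in zip(second_draw, activations)
)

# One possible fix (an assumption, not the project's chosen fix):
# store each example together with the activation computed from it.
paired = [(e, toy_activation(e)) for e in first_draw]
aligned = all(toy_activation(e) == a for e, a in paired)
```

Disabling the shuffle on the second load would not be enough by itself, since the first load would still be shuffled; the two loads have to share one ordering.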
Hi James, can you clarify why BERT models need the raw examples in order to calculate gradients rather than just the activations from a given bottleneck?
I think the BERT adoption was written by some folks at Google (though maybe @jameswex is referring to some external case?), and I am not sure why. My best guess is that they wanted to investigate a directional derivative of a particular example (to use it for some other purpose). I could be wrong.
@BeenKim FYI