Support petals distributed model classes
#205
Merged
Description
This PR adds preliminary support for the `AutoDistributedModelForCausalLM` class from the `petals` library. Concretely, it supports bypassing the usage of `attention_mask` in `model.generate` and `model.forward`, while support for attention information in `petals` is still a work in progress (see #158 for reference).

Usage
The following code snippet showcases an example of contrastive attribution using the Input X Gradient method and the Llama 65B model (tested on a machine with a 6GB RTX 3060):
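The original snippet is not preserved here; the following is a minimal sketch of what it could look like, assuming the `petals` and `inseq` APIs at the time of this PR. The checkpoint name, prompt, and contrast target are illustrative, and running it requires a connection to a petals swarm serving the model:

```python
import inseq
from petals import AutoDistributedModelForCausalLM

# Illustrative checkpoint name; any petals-served Llama 65B checkpoint works.
model_name = "enoch/llama-65b-hf"

# Only the embeddings and LM head live locally; the transformer blocks are
# served remotely by the petals swarm, so a 6GB GPU is sufficient.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name).cuda()

# Wrap the distributed model with a gradient-based attribution method.
inseq_model = inseq.load_model(model, "input_x_gradient", tokenizer=model_name)

# Contrastive attribution: which input tokens drive the choice of "sofa"
# over the alternative continuation "couch"?
out = inseq_model.attribute(
    "The cat sat on the",
    "The cat sat on the sofa",
    attributed_fn="contrast_prob_diff",
    contrast_targets="The cat sat on the couch",
)
out.show()
```

Since only gradients with respect to the local embeddings are needed, gradient-based methods like Input X Gradient work unchanged on the distributed model.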
Notes
Methods requiring model internals (e.g. `attention`) are currently not supported and will raise exceptions if used alongside `AutoDistributedModelForCausalLM` classes.