v0.3.3: Attention attribution, new aggregation, improved saving/reloading and more

What’s Changed

Attention attribution (#148)

This release introduces a new category of attention attribution methods, adding support for AttentionAttribution (id: attention). The method attributes generated outputs using the raw attention weights extracted during the forward pass, as done inter alia by Jain and Wallace, 2019. The heads and layers parameters select a single element (a single int), a range (a tuple (start_idx, end_idx)) or a set of custom valid indices (a list [idx_1, idx_2, ...]) for attention heads and model layers, respectively. Multiple heads or layers can be aggregated with one of the default aggregators (e.g. max, average) or with a custom function passed as aggregate_heads_fn or aggregate_layers_fn in the call to model.attribute().

Example of default usage:

import inseq

# Load the translation model with the "attention" attribution method
model = inseq.load_model("facebook/wmt19-en-de", "attention")
# Attribute with defaults: average across all heads of the final layer
out = model.attribute("The developer argued with the designer because her idea cannot be implemented.")

The defaults are chosen to minimize the number of parameters users need to specify: in the case above, the result is the average across all attention heads of the final layer.

Example of advanced usage:

import inseq

model = inseq.load_model("facebook/wmt19-en-de", "attention")
out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    layers=(0, 5),
    heads=[0, 2, 5, 7],
    aggregate_heads_fn="max",
)

In the case above, the attention weights of heads 0, 2, 5 and 7 are first averaged across the first five layers of the model (the default layer aggregation), and the outcome is a matrix of their maximum attention weights. A custom aggregator can be supplied in the same way, as sketched below.
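As a hedged sketch of the custom-function option (the exact contract is an assumption here: the aggregator is taken to receive the selected heads' weights as a tensor and to reduce the head dimension, assumed to be the last one; median_heads is a hypothetical helper, not part of the library):

import inseq
import torch

model = inseq.load_model("facebook/wmt19-en-de", "attention")

# Hypothetical custom aggregator replacing the built-in "max"/"average".
# Assumption: it receives the selected heads' attention weights as a tensor
# and must reduce the head dimension (taken here to be the last one).
def median_heads(weights: torch.Tensor) -> torch.Tensor:
    return weights.median(dim=-1).values

out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    heads=[0, 2, 5, 7],
    aggregate_heads_fn=median_heads,
)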

Other attention-based methods will be added in upcoming releases (see the summary issue #108).

L2 + Normalize default aggregation (#157)

Starting from this release, the default aggregation used to produce token-level attribution scores for GradientFeatureAttributionSequenceOutput objects is the L2 norm of the attribution tensor over the hidden_size dimension, followed by a step-wise normalization of the attributions (all attributions across source and target at every generation step sum to one). This replaces the previous approach, which was a simple sum over the hidden dimension followed by a division by the norm of the step attribution vector. Importantly, since the L2 norm is guaranteed to be positive, the resulting attribution scores will now always be positive (also for integrated_gradients).
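A minimal standalone sketch of the new aggregation (illustrative only, not the library's internal code), assuming a step attribution tensor of shape (num_tokens, hidden_size):

import torch

# Toy step attribution tensor: one row per attributed token at a given step
step_attributions = torch.randn(10, 512)

# L2 norm over the hidden_size dimension yields one positive score per token
scores = step_attributions.norm(p=2, dim=-1)

# Step-wise normalization: scores across source and target sum to one
normalized_scores = scores / scores.sum()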

Motivations:

  • Good empirical faithfulness of this aggregation procedure on transformer-based models, as shown by Bastings et al. 2022
  • Positivity and normalization make it easier to understand the individual contribution of every input to the generated output.

Improved saving and reloading of attributions (#157)

When saving attribution outputs, it is now possible to obtain one file per sequence by specifying split_sequences=True and to automatically compress the generated outputs with compress=True.

import inseq

model = inseq.load_model("Helsinki-NLP/opus-mt-en-it", "saliency")
out = model.attribute(["sequence one", "sequence number two"])
# Creates out_0.json.gz, out_1.json.gz
out.save("out.json.gz", split_sequences=True, compress=True)
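Saved outputs can be reloaded with FeatureAttributionOutput.load; a minimal sketch, assuming the compressed per-sequence files produced above are handled transparently on load:

import inseq

# Reload one of the per-sequence files created by save(..., split_sequences=True).
# Assumption: load() transparently handles the gzip compression applied by
# save(..., compress=True).
out_0 = inseq.FeatureAttributionOutput.load("out_0.json.gz")
out_0.show()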

Export attributions for usage with pandas (#157)

The new method FeatureAttributionOutput.get_scores_dicts makes it possible to export source_attributions, target_attributions and step_scores as dictionaries that can easily be loaded into pd.DataFrame objects for further analysis (thanks @MoritzLaurer for raising the issue!). Example usage:

import inseq
import pandas as pd

model = inseq.load_model("Helsinki-NLP/opus-mt-en-it", "saliency")
out = model.attribute(
    ["Hello ladies and badgers!", "This is a test input"], attribute_target=True, step_scores=["probability", "entropy"]
)
# A list of dataframes (one per sequence) corresponding to source matrices in out.show
source_dfs = [pd.DataFrame(x["source_attributions"]) for x in out.get_scores_dicts()]
# A list of dataframes (one per sequence) corresponding to target matrices in out.show
target_dfs = [pd.DataFrame(x["target_attributions"]) for x in out.get_scores_dicts()]
# A list of dataframes (one per sequence) with step score ids as rows and generated target tokens as columns
step_dfs = [pd.DataFrame(x["step_scores"]) for x in out.get_scores_dicts()]

ruff for style and quality checks (#159)

Starting from this release, Inseq drops flake8, isort, pylint and pyupgrade and moves to ruff with the corresponding extensions for style and quality checks. This dramatically speeds up build checks (from ~4 minutes to <1 second). Library developers are advised to integrate ruff in their automatic checks during coding (a VS Code extension and a PyCharm plugin are available).

👥 List of contributors

@gsarti and @lsickert