
Commit

changed if man file
infinite-pursuits committed Oct 8, 2024
1 parent 5121992 commit 7b6b63a
Showing 1 changed file with 8 additions and 4 deletions.
12 changes: 8 additions & 4 deletions _posts/2024-10-07-ifman.md
@@ -56,13 +56,17 @@ $\min_{\theta^{\prime}:\rm {dist} (\theta^*, \theta^{\prime}) \leq C} \ell_{\rm
When the target set $Z_{\rm {target}} \subset Z$ consists of more than one sample, we can simply re-apply the above attack multiple times, albeit on different samples. The primary challenge with these attacks is that computing gradients of influence-based loss objectives is computationally infeasible, since it requires backpropagating through Hessian-inverse-vector products. We address this challenge with a simple, memory- and time-efficient, backward-friendly algorithm that computes these gradients using existing PyTorch machinery.
This contribution is of independent technical interest: the literature has focused only on making the forward computation of influence functions feasible, whereas we study techniques to make the *backward pass* viable. Our algorithm brings the memory required for one forward $+$ backward pass down from exceeding a 12GB GPU to 7GB for a 206K-parameter model, and from 8GB to 1.7GB for a 5K-parameter model.
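To make the bottleneck concrete, here is a minimal, naive sketch (a standard LiSSA-style approach we assume for illustration, not the memory-efficient algorithm described above): an inverse-Hessian-vector product built from `torch.autograd`, kept differentiable with `create_graph=True` so that gradients of an influence-style objective can flow back through it. The model, data, and hyperparameters are toy placeholders.

```python
import torch
import torch.nn.functional as F

def flat_grad(loss, params, create_graph=False):
    # Gradient of a scalar loss w.r.t. params, flattened into one vector.
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def ihvp_lissa(train_loss, params, v, steps=50, damping=0.01, scale=10.0):
    # Approximate H^{-1} v with the LiSSA recursion h <- v + (I - (H + damping*I)/scale) h.
    # create_graph=True keeps every Hessian-vector product in the autograd graph,
    # which is exactly what makes the backward pass so memory-hungry.
    h = v
    for _ in range(steps):
        g = flat_grad(train_loss, params, create_graph=True)
        hv = flat_grad(g @ h, params, create_graph=True)  # Hessian-vector product
        h = v + h - (hv + damping * h) / scale
    return h / scale

# Toy setup: a tiny multi-class logistic-regression model on random features.
torch.manual_seed(0)
w = torch.randn(5, 3, requires_grad=True)
X, y = torch.randn(32, 5), torch.randint(0, 3, (32,))
x_tgt, y_tgt = torch.randn(1, 5), torch.randint(0, 3, (1,))

def loss_fn(inputs, labels):
    return F.cross_entropy(inputs @ w, labels)

train_loss = loss_fn(X, y)
g_tgt = flat_grad(loss_fn(x_tgt, y_tgt), [w], create_graph=True)  # gradient at the target sample
g_z = flat_grad(loss_fn(X[:1], y[:1]), [w], create_graph=True)    # gradient at one training sample
influence = -(g_tgt @ ihvp_lissa(train_loss, [w], g_z))           # standard influence score

influence.backward()  # backprop *through* the iHVP: the expensive step
print(w.grad.shape)   # torch.Size([5, 3])
```

The chain of `create_graph=True` calls keeps every iteration of the recursion resident in memory, which is why such a naive backward pass fails to fit on a 12GB GPU even for modest models.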

#### Fairness Certification in-the-clear
#### Experimental Results

The fairness metric we use is Local Individual Fairness (IF), and we give a simple algorithm to compute this certificate by exploiting a connection between adversarial robustness and IF. Experimentally, we see that the resulting certification algorithm is able to differentiate between less and more fair models.
All our experiments are on multi-class logistic regression models trained on ResNet50 embeddings for standard vision datasets. Our results are as follows.

1. **Our Single-Target attack performs better than a non-influence Baseline.** Consider a non-influence baseline attack for increasing the importance of a training sample: reweigh the training loss, placing a high weight on the loss for the target sample. Our attack achieves a significantly higher success rate than this baseline, with a much smaller accuracy drop, under all settings, as shown in the table below; a minimal sketch of the baseline follows the table.

<div class='l-body' align="center">
<img class="img-fluid rounded z-depth-1" src="{{ site.baseurl }}/assets/img/2024-07-fairproof/fair-unfair.png">
<figcaption style="text-align: center; margin-top: 10px; margin-bottom: 10px;"> Histogram of fairness parameter for fair and unfair models for 100 randomly sampled data points. Fairness parameter values are higher for more fair models.</figcaption>
<img class="img-fluid rounded z-depth-1" src="{{ site.baseurl }}/assets/img/2024-10-ifman/baselinevsours.png">
<figcaption style="text-align: center; margin-top: 10px; margin-bottom: 10px;"> Success Rates of the Baseline vs. our Single-Target Attack for Data Valuation. $k$ is the ranking, as in top-$k$. ${\small \Delta_{\rm acc}} := \small \rm TestAcc(\theta^*) - \small \rm TestAcc(\theta^\prime)$ denotes the drop in test accuracy for the manipulated model $\theta^\prime$. Two success rates are reported: (1) when $\small \Delta_{\rm acc} \leq 3\%$, and (2) the best success rate irrespective of accuracy drop. (%) denotes model accuracy. (-) means a model with a non-zero success rate could not be found, and hence accuracy cannot be stated. *Our attack has a significantly higher success rate than the baseline, with a much smaller accuracy drop, under all settings.*</figcaption>
</div>
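For concreteness, here is a minimal sketch of this reweighting baseline (toy data and a hypothetical weight `lam`, not the exact experimental setup): a multi-class logistic-regression head trained on fixed embeddings with the target sample's loss upweighted.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
emb_dim, n_classes, lam = 2048, 10, 50.0   # lam: weight on the target's loss (hypothetical value)
W = torch.zeros(emb_dim, n_classes, requires_grad=True)

X_train = torch.randn(256, emb_dim)        # stand-ins for ResNet50 embeddings
y_train = torch.randint(0, n_classes, (256,))
x_tgt, y_tgt = X_train[:1], y_train[:1]    # training sample whose importance we want to inflate

opt = torch.optim.SGD([W], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    base_loss = F.cross_entropy(X_train @ W, y_train)
    tgt_loss = F.cross_entropy(x_tgt @ W, y_tgt)
    (base_loss + lam * tgt_loss).backward()  # reweighted objective: high weight on the target sample
    opt.step()
```

Upweighting makes the model fit the target sample more tightly, but as the table shows, this baseline pays a much larger accuracy cost and still achieves a lower success rate than the influence-based attack.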

#### ZKP for Fairness Certification
