@afiodorov thanks for taking a look at the code and for your feedback.
You are right: with the constant-zero and median strategies, the loop is redundant. I plan to separate the direct perturbation code from the overall method so that each run is independent. I am testing a local branch that handles this at the moment and will push a fix up later this week.
On the purpose of direct perturbation: this is also a good question, and you are right that it is not explained in the thesis. We are working on documentation that fully explains the overall method.
For now, here is the justification for including direct perturbation. Suppose you have a function f(x_1, x_2, x_3). At a high level, what fairml does is give you the dependence of f on each of the x_i, and that dependence is calculated as direct influence + indirect influence. For the direct influence, we transform the data using one of the direct perturbation strategies and then look at the impact of that perturbation on the black-box function's output. For the indirect influence, we generate the transformations using an orthogonal transformation instead.
Certainly, we could just use the orthogonal transformation on all variables, including the subject column, but we wanted to give people the flexibility to pick whatever function they are interested in using for this task. Hope this helps explain the direct perturbation strategy requirement.
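To make this concrete, here is a minimal numpy sketch of what a direct perturbation strategy does (illustrative only; the function name, the `model` callable, and the strategy labels are assumptions for exposition, not fairml's actual API):

```python
import numpy as np

def direct_influence(model, X, col, strategy="median"):
    """Apply a direct perturbation strategy to column `col` and measure
    how much the black-box output moves on average."""
    X_ptb = X.copy()
    if strategy == "constant-zero":
        X_ptb[:, col] = 0.0                    # zero the column out entirely
    elif strategy == "median":
        X_ptb[:, col] = np.median(X[:, col])   # flatten the column to its median
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.mean(np.abs(model(X) - model(X_ptb)))

# toy usage with a linear "black box"
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
model = lambda X: X @ np.array([2.0, -1.0, 0.5])
print(direct_influence(model, X, col=0))
```

The indirect influence is computed analogously, except that the transformation applied to the data is an orthogonal projection of the other columns rather than a direct replacement of the subject column.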
Nice method.
I am examining the code and the thesis more closely, as it appears to be very useful. I don't fully understand the point of the perturbation strategy, and it's not fully expanded on in the thesis.
I started reading the code and I spotted some bugs.
Firstly
https://github.com/adebayoj/fairml/blob/master/fairml/orthogonal_projection.py#L120
takes the strategy but ignores it, see:
https://github.com/adebayoj/fairml/blob/master/fairml/orthogonal_projection.py#L217
Also, I think that with the `constant_zero` and `median` perturbation strategies this loop is redundant:
https://github.com/adebayoj/fairml/blob/master/fairml/orthogonal_projection.py#L205
Since each run ignores `random_sample_selected` anyway, every run should produce the same `output_difference_col` and `total_difference` (because `data_col_ptb` and `total_ptb_data` are identical on each run).
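To illustrate the point (a toy sketch with made-up data, not your actual code): with a deterministic strategy such as the median, the perturbed column never changes across runs, so the loop adds nothing:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
col = 0

results = []
for run in range(5):
    # a random sample is selected here in the real loop, but never used
    _random_sample_selected = rng.choice(X.shape[0], size=50, replace=False)
    # deterministic median strategy: the perturbed column is the same every run
    data_col_ptb = np.full(X.shape[0], np.median(X[:, col]))
    results.append(data_col_ptb)

# all five runs produced an identical perturbed column,
# so averaging differences across runs changes nothing
assert all(np.array_equal(results[0], r) for r in results)
```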
Finally, it would be great if the documentation explained the purpose of `direct_input_pertubation_strategy` in more detail. Is it necessary at all to "zero out" a column? Why? It appears to me that just by orthogonalising the other columns you already take away the effect of the subject column (see the sketch below for the projection I have in mind), so it is not clear to me why zeroing out is required on top of that. Is it to be certain the effect of the column is not present?
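For concreteness, this is the kind of orthogonalisation I have in mind (a toy sketch, not your actual code; the function name is mine):

```python
import numpy as np

def orthogonalise_others(X, col):
    """Subtract from every other column its projection onto column `col`,
    leaving those columns with no linear trace of the subject column."""
    v = X[:, col]
    X_out = X.astype(float)
    for j in range(X.shape[1]):
        if j != col:
            X_out[:, j] -= (X[:, j] @ v) / (v @ v) * v
    return X_out
```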
Many thanks for the code by the way!