Possibility to Return top K positives and top K negatives for LOCO #264
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #264      +/-   ##
==========================================
+ Coverage   86.63%   86.66%   +0.02%
==========================================
  Files         315      315
  Lines       10366    10396      +30
  Branches      336      556     +220
==========================================
+ Hits         8981     9010      +29
- Misses       1385     1386       +1
Continue to review full report at Codecov.
* derived features based on the LOCO insight.
* - If Abs, returns at most topK elements : the topK derived features having highest absolute value of LOCO score.
* @param model model instance that you wish to explain
* @param uid uid for instance
*/
@Experimental
should we remove the @Experimental flag already?
no - we want to keep it so we can change things :-) For instance the way of creating this class is still terrible
"Classification and Regression." | ||
) | ||
|
||
def setTopKStrategy(strat: TopKStrategy): this.type = set(topKStrategy, strat.entryName) |
please name fully - strategy: TopKStrategy
case 1 => ProblemType.Regression
case 2 => ProblemType.BinaryClassification
case n if (n > 2) => {
log.info("MultiClassification Problem : Top K LOCOs by absolute value")
This is not a very user friendly message. Let's log some meaningful message for any problem type with some clear and comprehensible explanation.
featureArray.update(i, (oldInd, oldVal))
i += 1
val top = $(topKStrategy) match {
case s if s == TopKStrategy.Abs.entryName || problemType == ProblemType.MultiClassification => {
add a getter method for strategy: def getTopKStrategy: TopKStrategy
and then use it here instead of string equality: (getTopKStrategy, problemType) match { case (TopKStrategy.Abs, ProblemType.MultiClassification) => ... }
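A sketch of how that getter and tuple match could look, using simplified stand-ins for the PR's TopKStrategy and ProblemType types (the real getter would presumably read the param, e.g. TopKStrategy.withName($(topKStrategy))):

```scala
// Simplified stand-ins for the PR's enums, so the tuple match is runnable here.
sealed trait TopKStrategy
object TopKStrategy {
  case object Abs extends TopKStrategy
  case object PositiveNegative extends TopKStrategy
}

sealed trait ProblemType
object ProblemType {
  case object Regression extends ProblemType
  case object BinaryClassification extends ProblemType
  case object MultiClassification extends ProblemType
}

// A typed tuple match instead of comparing entryName strings.
def pickTop(strategy: TopKStrategy, problemType: ProblemType): String =
  (strategy, problemType) match {
    // multiclass always falls back to top K by absolute value
    case (TopKStrategy.Abs, _) | (_, ProblemType.MultiClassification) =>
      "top K by absolute value"
    case (TopKStrategy.PositiveNegative, _) =>
      "top K positives and top K negatives"
  }
```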
please also create a function for each of the cases to avoid this method growing large
var positiveCount = 0
// Size of negative heap
var negativeCount = 0
for {i <- 0 until filledSize} {
use while - it's faster
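The suggested rewrite amounts to this (a self-contained sketch with made-up values): a `for { i <- 0 until n }` comprehension goes through Range.foreach with a closure call per element, while a plain while loop does not.

```scala
// While-loop form of the iteration in this hot path.
val values = Array(1.0, 2.0, 3.0)
var sum = 0.0
var i = 0
while (i < values.length) {
  sum += values(i)
  i += 1
}
// sum is 6.0
```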
if (positiveCount > k) { // remove the lowest element if the heap size goes from 5 to 6
positiveMaxHeap.dequeue()
}
} else if (max < 0.0) { // if negative LOCO then add it to negative heap
what about max == 0.0? it's not handled.
No. LOCOs of 0.0 are not interesting
for {i <- 0 until filledSize} {
val (oldInd, oldVal) = featureArray(i)
val diffs = computeDiffs(i, oldInd, featureArray, featureSize, baseScore)
val max = if (problemType == ProblemType.Regression) diffs(0) else diffs(1)
this seems very hacky and error prone: if (problemType == ProblemType.Regression) diffs(0) else diffs(1) - perhaps add a helper function?
this is where you would use the indexToExamine variable I suggested above
val parsed = insights.collect(name, transformer.getOutput())
.map { case (n, i) => n -> RecordInsightsParser.parseInsights(i) }
parsed.foreach { case (_, in) =>
in.head._1.columnName == "1_1_1_1" || in.last._1.columnName == "3_3_3_3" shouldBe true
what does this even mean: in.head._1.columnName == "1_1_1_1" || in.last._1.columnName == "3_3_3_3"? please add a withClue around it to allow easier debugging in case of failures.
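The withClue suggestion could look like this (a hypothetical fragment, assuming ScalaTest Matchers and the surrounding test's `in` value are in scope):

```scala
// Hypothetical: wrap the assertion so a failure reports the actual column
// names instead of just "false was not true".
withClue(s"head=${in.head._1.columnName}, last=${in.last._1.columnName}: ") {
  (in.head._1.columnName == "1_1_1_1" || in.last._1.columnName == "3_3_3_3") shouldBe true
}
```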
val label = labelNoRes.copy(isResponse = true)
val testDataMeta = addMetaData(testData, "features", 5)
val sparkModel = new OpLogisticRegression().setInput(label, featureVector).fit(testData)
remove redundant lines
case 0 => ProblemType.Unknown
case 1 => ProblemType.Regression
case 2 => ProblemType.BinaryClassification
case n if (n > 2) => {
I was thinking about the multiclass issue - I think it makes sense to return the difference in score for the winning category, so maybe instead of having this find the problem type it should return the index to examine (also move it down to the base score function, line 117):

val baseResult = modelApply(labelDummy, features)
val baseScore = baseResult.score
val indexToExamine = baseScore.length match {
  case 0 => throw new RuntimeException("model does not produce scores for insights")
  case 1 => 0
  case 2 => 1
  case n if (n > 2) => baseResult.prediction
}

and then always use this index for pulling out the diff for the heaps
topAbs.sortBy { case (_, v, _) => -math.abs(v) }

}
case s if s == TopKStrategy.PositiveNegative.entryName => {
what if we always kept the positive and negative heap and then just check the strategy when we return the insights?
I like that! This would keep the code simpler.
It would be slower when we want to return the top by abs value. Is there an easy way to go from 2 Priority Queues to 1?
It would be slightly slower but not by much if we assume that (or enforce that) k is small. To go to one q - just get all the answers and sort them. Again you are right that it is slower but if we limit k to ~100 it is a pretty small difference.
Or you can get the first of each queue and then basically keep popping off the last queue that you took from for comparison - then it is not much slower, you just have a slightly larger memory footprint
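The "keep both heaps, merge on return" idea being discussed can be sketched like this; for small k, dumping both heaps and sorting once is cheap (the LOCO values here are made up):

```scala
import scala.collection.mutable.PriorityQueue

// Positive LOCOs in a max-heap, negative LOCOs in a min-heap (reverse ordering),
// merged into a single top-k-by-|value| list with one sort.
val positives = PriorityQueue(0.9, 0.4, 0.1)
val negatives = PriorityQueue(-0.8, -0.3)(Ordering[Double].reverse)
val k = 3

val topByAbs = (positives.toList ++ negatives.toList)
  .sortBy(v => -math.abs(v))
  .take(k)
// topByAbs == List(0.9, -0.8, 0.4)
```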
Another issue @leahmcguire. For multiclassification, we return the highest absolute LOCO value, and for the other strategy we look at the winning category (baseScore). Getting the top positive and negative for the winning class doesn't guarantee the highest absolute value: what if this absolute value was from another class?
Yeah - I think that the fact that we were just taking the highest absolute value was a mistake. It allows for things like baseScore = (0.1, 0.3, 0.6) new score = (0.3, 0.1, 0.6) to be the difference we report, which seems wrong. Always choosing the change in the winning value will mean that the change we report always has to do with the score that won, even if it is not the largest change in score. So yes we would change strategy for multiclass in absolute value as well - but I think it is slightly more principled...WDYT?
Now the question is whether we should apply the same logic for binary classification.
So far the spikes were focused on the positive class (label 1.0) because we thought it was the most interesting info to report.
However, recent spikes show a strong relationship between LOCO scores and prediction scores (cc @crupley): rows with the "highest scores" had, on average, their top absolute LOCOs being mostly positive. For the "lowest scores", those LOCOs are mostly negative.
One can also argue that LOCO for class 0 = - LOCO for class 1, hence getting top positives and negatives is sufficient.
Maybe something we might need to add to the output is the base prediction.
Hmm, that is interesting. But wouldn't it be confusing to interpret for binary if we choose the winning value, because they are symmetric? The scores will always sum to 1, so a positive contribution to the prediction of 1 will be the same negative contribution to the prediction of 0. If we switch based on the prediction, users will have to keep the winning prediction in mind to interpret the effect - this is something you need to do in multiclass but seems like an extra complication in binary.
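The symmetry argument can be checked with a two-line example (made-up scores): binary class scores sum to 1, so the per-class LOCO diffs are equal and opposite.

```scala
val baseScore = Array(0.3, 0.7) // P(class 0), P(class 1) on the full feature set
val newScore  = Array(0.4, 0.6) // scores with one feature "left out"
val diffs = baseScore.zip(newScore).map { case (b, n) => b - n }
// diffs is approximately Array(-0.1, 0.1):
// the class-0 diff is the negative of the class-1 diff
```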
while (i < filledSize) {
val (oldInd, oldVal) = featureArray(i)
val diffs = computeDiffs(i, oldInd, featureArray, featureSize, baseScore)
val max = diffs(indexToExamine)
maybe rename to diffToExamine
just minor stuff then LGTM
negativeMaxHeap.dequeue()
} // Not keeping LOCOs with value 0
}
featureArray.update(i, (oldInd, oldVal))
would it make sense to move the update of the array back to the old version into the computeDiffs function?
override def transformFn: OPVector => TextMap = (features) => {
val baseResult = modelApply(labelDummy, features)
val baseScore = baseResult.score
modelApply(labelDummy, features).prediction
think line 148 is leftover from a refactor
case n if (n > 2) => baseResult.prediction.toInt
}
val topPosNeg = returnTopPosNeg(filledSize, featureArray, featureSize, baseScore, k, indexToExamine)
val top = getTopKStrategy match {
getTopKStrategy match {
  case TopKStrategy.Abs =>
  case TopKStrategy.PositiveNegative =>
}
lgtm, one minor comment on match statement
hello! How does the multi-class evaluator scale to compute ROC, PR, FPR and TPR for each category?
Describe the proposed solution
New feature used only for Binary Classification (probability of predicting a positive label) and Regression.
The map's contents differ depending on the value of the topKStrategy param:
If PositiveNegative, returns at most 2 x topK elements: the topK most positive and the topK most negative derived features based on the LOCO insight.
If Abs, returns at most topK elements: the topK derived features having the highest absolute value of LOCO score.
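The two strategies can be sketched over (derived feature name, LOCO value) pairs; the names and values below are made up for illustration:

```scala
val locos = Seq("f1" -> 0.5, "f2" -> -0.9, "f3" -> 0.2, "f4" -> -0.1, "f5" -> 0.7)
val topK = 2

// Abs: at most topK features with the largest |LOCO|
val topAbs = locos.sortBy { case (_, v) => -math.abs(v) }.take(topK)
// topAbs == Seq(("f2", -0.9), ("f5", 0.7))

// PositiveNegative: at most 2 x topK, the topK most positive and topK most negative
val topPos = locos.filter(_._2 > 0).sortBy(-_._2).take(topK)
val topNeg = locos.filter(_._2 < 0).sortBy(_._2).take(topK)
// topPos == Seq(("f5", 0.7), ("f1", 0.5)); topNeg == Seq(("f2", -0.9), ("f4", -0.1))
```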