Possibility to Return top K positives and top K negatives for LOCO #264

Merged
merged 17 commits into master on Apr 5, 2019

Conversation

michaelweilsalesforce
Contributor

Describe the proposed solution

A new feature, currently used only for Binary Classification (the probability of predicting the positive label) and Regression.

The map's contents differ depending on the value of the topKStrategy param:
If PositiveNegative, returns at most 2 x topK elements: the topK most positive and the topK most negative derived features, based on the LOCO insight.
If Abs, returns at most topK elements: the topK derived features with the highest absolute value of LOCO score.
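The two strategies can be sketched as follows. This is a hypothetical, simplified helper, not the actual RecordInsightsLOCO code: real insights carry feature metadata, not plain name/score pairs.

```scala
// Illustrative sketch of how the two topKStrategy values select insights
// from per-feature LOCO scores (assumed (name, score) shape).
object TopKStrategySketch {
  // Abs: the topK derived features with the highest absolute LOCO score
  def topAbs(scores: Seq[(String, Double)], k: Int): Seq[(String, Double)] =
    scores.sortBy { case (_, v) => -math.abs(v) }.take(k)

  // PositiveNegative: topK most positive plus topK most negative,
  // so at most 2 * k elements; zero-valued LOCOs are dropped
  def topPositiveNegative(scores: Seq[(String, Double)], k: Int): Seq[(String, Double)] = {
    val positives = scores.filter(_._2 > 0.0).sortBy { case (_, v) => -v }.take(k)
    val negatives = scores.filter(_._2 < 0.0).sortBy { case (_, v) => v }.take(k)
    positives ++ negatives
  }
}
```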

@codecov

codecov bot commented Apr 3, 2019

Codecov Report

Merging #264 into master will increase coverage by 0.02%.
The diff coverage is 97.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #264      +/-   ##
==========================================
+ Coverage   86.63%   86.66%   +0.02%     
==========================================
  Files         315      315              
  Lines       10366    10396      +30     
  Branches      336      556     +220     
==========================================
+ Hits         8981     9010      +29     
- Misses       1385     1386       +1
Impacted Files Coverage Δ
...e/op/stages/impl/insights/RecordInsightsLOCO.scala 95.31% <97.82%> (+1.19%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 209647c...a137d61. Read the comment docs.

* derived features based on the LOCO insight.
* - If Abs, returns at most topK elements: the topK derived features having the highest absolute value of LOCO score.
* @param model model instance that you wish to explain
* @param uid uid for instance
*/
@Experimental
Collaborator

should we remove the @Experimental flag already?

Collaborator

no - we want to keep it so we can change things :-) For instance the way of creating this class is still terrible

"Classification and Regression."
)

def setTopKStrategy(strat: TopKStrategy): this.type = set(topKStrategy, strat.entryName)
Collaborator

please name fully - strategy: TopKStrategy

case 1 => ProblemType.Regression
case 2 => ProblemType.BinaryClassification
case n if (n > 2) => {
log.info("MultiClassification Problem : Top K LOCOs by absolute value")
Collaborator

This is not a very user-friendly message. Let's log a meaningful message for any problem type, with a clear and comprehensible explanation.

featureArray.update(i, (oldInd, oldVal))
i += 1
val top = $(topKStrategy) match {
case s if s == TopKStrategy.Abs.entryName || problemType == ProblemType.MultiClassification => {
Collaborator

add a getter method for strategy, def getTopKStrategy: TopKStrategy, and then use it here instead of string equality: (getTopKStrategy, problemType) match { case (TopKStrategy.Abs, ProblemType.MultiClassification) => ... }
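The suggestion above could look roughly like this. It is a self-contained sketch with assumed names (StrategyParamSketch is hypothetical); the real stage stores the param via the Spark params machinery rather than a plain var.

```scala
// Sketch: keep the param as a string internally, but expose a typed
// getter so call sites can pattern-match on the enum instead of
// comparing entryName strings.
sealed trait TopKStrategy { def entryName: String }
object TopKStrategy {
  case object Abs extends TopKStrategy { val entryName = "abs" }
  case object PositiveNegative extends TopKStrategy { val entryName = "positive-negative" }
  val values: Seq[TopKStrategy] = Seq(Abs, PositiveNegative)
  def withName(name: String): TopKStrategy =
    values.find(_.entryName == name).getOrElse(sys.error(s"unknown strategy: $name"))
}

class StrategyParamSketch {
  private var topKStrategy: String = TopKStrategy.Abs.entryName
  def setTopKStrategy(strategy: TopKStrategy): this.type = {
    topKStrategy = strategy.entryName; this
  }
  def getTopKStrategy: TopKStrategy = TopKStrategy.withName(topKStrategy)
}
```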

Collaborator

please also create a function for each of the cases to avoid this method growing large

var positiveCount = 0
// Size of negative heap
var negativeCount = 0
for {i <- 0 until filledSize} {
Collaborator

use while - it's faster

if (positiveCount > k) { // remove the lowest element if the heap size goes from 5 to 6
positiveMaxHeap.dequeue()
}
} else if (max < 0.0) { // if negative LOCO then add it to negative heap
Collaborator

what about max == 0.0? it's not handled.

Contributor Author

No. LOCOs of 0.0 are not interesting
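The heap bookkeeping being discussed can be sketched like this. It is a hypothetical, simplified helper: a bounded min-heap of positive LOCO values that skips zeros, mirroring the size check quoted above.

```scala
import scala.collection.mutable.PriorityQueue

// Sketch of keeping only the k largest positive LOCO values with a
// size-bounded heap; zero-valued LOCOs are skipped entirely.
object BoundedHeapSketch {
  def topKPositive(values: Seq[Double], k: Int): Seq[Double] = {
    // reverse ordering => dequeue removes the smallest kept value
    val heap = PriorityQueue.empty[Double](Ordering[Double].reverse)
    for (v <- values if v > 0.0) {
      heap.enqueue(v)
      if (heap.size > k) heap.dequeue() // e.g. drop when size goes from 5 to 6
    }
    heap.dequeueAll.sorted(Ordering[Double].reverse)
  }
}
```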

for {i <- 0 until filledSize} {
val (oldInd, oldVal) = featureArray(i)
val diffs = computeDiffs(i, oldInd, featureArray, featureSize, baseScore)
val max = if (problemType == ProblemType.Regression) diffs(0) else diffs(1)
Collaborator

this seems very hacky and error prone - if (problemType == ProblemType.Regression) diffs(0) else diffs(1), perhaps add a helper function?

Collaborator

this is where you would use the indexToExamine variable I suggested above

val parsed = insights.collect(name, transformer.getOutput())
.map { case (n, i) => n -> RecordInsightsParser.parseInsights(i) }
parsed.foreach { case (_, in) =>
in.head._1.columnName == "1_1_1_1" || in.last._1.columnName == "3_3_3_3" shouldBe true
Collaborator

what does this even mean in.head._1.columnName == "1_1_1_1" || in.last._1.columnName == "3_3_3_3"?
please add a withClue around it to allow easier debugging in case of failures.

val label = labelNoRes.copy(isResponse = true)
val testDataMeta = addMetaData(testData, "features", 5)
val sparkModel = new OpLogisticRegression().setInput(label, featureVector).fit(testData)

Collaborator

remove redundant lines

case 0 => ProblemType.Unknown
case 1 => ProblemType.Regression
case 2 => ProblemType.BinaryClassification
case n if (n > 2) => {
Collaborator

I was thinking about the multiclass issue - I think it makes sense to return the difference in score for the winning category, so maybe instead of having this find the problem type it should return the index to examine (also move it down to the base score function, line 117):

val baseResult = modelApply(labelDummy, features)
val baseScore = baseResult.score
val indexToExamine = baseScore.length match {
case 0 => throw new RuntimeException("model does not produce scores for insights")
case 1 => 0
case 2 => 1
case n if (n > 2) => baseResult.prediction

and then always use this index for pulling out the diff for the heaps
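The suggested index selection could be sketched as a standalone helper. Names are assumed; the real code works on the Prediction result type rather than a raw array plus prediction value.

```scala
// Sketch of selecting the score index to examine: one score means
// regression, two means binary classification, more means multiclass,
// where we examine the winning category.
object IndexToExamineSketch {
  def indexToExamine(baseScore: Array[Double], prediction: Double): Int =
    baseScore.length match {
      case 0 => throw new RuntimeException("model does not produce scores for insights")
      case 1 => 0               // regression: the single score
      case 2 => 1               // binary: probability of the positive class
      case n if n > 2 => prediction.toInt // multiclass: winning category
    }
}
```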

topAbs.sortBy { case (_, v, _) => -math.abs(v) }

}
case s if s == TopKStrategy.PositiveNegative.entryName => {
Collaborator

what if we always kept the positive and negative heap and then just check the strategy when we return the insights?

Collaborator

I like that! This would keep the code simpler.

Contributor Author

It would be slower when we want to return the top by abs value. Is there an easy way to go from 2 Priority Queues to 1?

Collaborator

It would be slightly slower but not by much if we assume (or enforce) that k is small. To go to one queue - just get all the answers and sort them. Again, you are right that it is slower, but if we limit k to ~100 it is a pretty small difference.

Collaborator

Or you can get the first of each queue and then basically keep popping off the last queue that you took from for comparison - then it is not much slower, you just have a slightly larger memory footprint
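The drain-and-sort option discussed above could look like this: always keep both heaps, then, when the Abs strategy is requested, drain both and sort by absolute value. This is a sketch with an assumed (index, value) insight shape; with k capped the extra sort is cheap.

```scala
import scala.collection.mutable.PriorityQueue

// Sketch: combine the positive and negative heaps into a single
// top-k-by-absolute-value result by draining and sorting.
object HeapMergeSketch {
  type Insight = (Int, Double) // (feature index, LOCO value) - assumed shape

  def topByAbs(positive: PriorityQueue[Insight],
               negative: PriorityQueue[Insight],
               k: Int): Seq[Insight] =
    (positive.dequeueAll ++ negative.dequeueAll)
      .sortBy { case (_, v) => -math.abs(v) }
      .take(k)
}
```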

Contributor Author

Another issue @leahmcguire. For Multiclassification we return the highest absolute LOCO value, and for the other strategy we look at the winning category (baseScore). Getting the top positive and negative for the winning class doesn't guarantee the highest absolute value: what if that absolute value came from another class?

@leahmcguire (Collaborator) Apr 4, 2019

Yeah - I think that the fact that we were just taking the highest absolute value was a mistake. It allows for things like baseScore = (0.1, 0.3, 0.6), new score = (0.3, 0.1, 0.6) to be the difference we report, which seems wrong. Always choosing the change in the winning value means the change we report always has to do with the score that won, even if it is not the largest change in score. So yes, we would change strategy for multiclass in absolute value as well - but I think it is slightly more principled... WDYT?

@michaelweilsalesforce (Contributor Author) Apr 4, 2019

Now the question is whether we should apply the same logic to Binary Classification.
So far the spikes have focused on the positive class (label 1.0) because we thought it was the most interesting info to report.

However, recent spikes show a strong relationship between LOCO scores and prediction scores (cc @crupley): rows with the highest scores have, on average, their top absolute LOCOs mostly positive; for the lowest scores, those LOCOs are mostly negative.
One can also argue that LOCO for class 0 = - LOCO for class 1, hence getting top positives and negatives is sufficient.
Maybe something we might need to add to the output is the base prediction.

Collaborator

Hmm, that is interesting. But wouldn't it be confusing to interpret for binary if we choose the winning value, because they are symmetric? The scores will always sum to 1, so a positive contribution to the prediction of 1 is the same negative contribution to the prediction of 0. If we switch based on the prediction, users will have to keep the winning prediction in mind to interpret the effect - something you need to do in multiclass anyway, but it seems like an extra complication in binary.
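A tiny numeric check of the symmetry being discussed (hypothetical helper, not project code): because binary class scores sum to 1, the per-feature score change (LOCO diff) for class 0 is always the negation of the diff for class 1.

```scala
// Sketch: LOCO diff per class is baseline score minus score with the
// feature left out; for binary scores summing to 1 the two diffs are
// exact negations of each other.
object BinarySymmetrySketch {
  def locoDiffs(base: Array[Double], withoutFeature: Array[Double]): Array[Double] =
    base.zip(withoutFeature).map { case (b, w) => b - w }
}
```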

while (i < filledSize) {
val (oldInd, oldVal) = featureArray(i)
val diffs = computeDiffs(i, oldInd, featureArray, featureSize, baseScore)
val max = diffs(indexToExamine)
Collaborator

maybe rename to diffToExamine

@leahmcguire (Collaborator) left a comment

just minor stuff then LGTM

negativeMaxHeap.dequeue()
} // Not keeping LOCOs with value 0
}
featureArray.update(i, (oldInd, oldVal))
Collaborator

would it make sense to move the update of the array back to the old version into the computeDiffs function?

override def transformFn: OPVector => TextMap = (features) => {
val baseResult = modelApply(labelDummy, features)
val baseScore = baseResult.score
modelApply(labelDummy, features).prediction
Collaborator

I think line 148 is leftover from a refactor

case n if (n > 2) => baseResult.prediction.toInt
}
val topPosNeg = returnTopPosNeg(filledSize, featureArray, featureSize, baseScore, k, indexToExamine)
val top = getTopKStrategy match {
Collaborator

getTopKStrategy match {
   case TopKStrategy.Abs =>
   case TopKStrategy.PositiveNegative =>  
}

@tovbinm (Collaborator) left a comment

lgtm, one minor comment on match statement

@michaelweilsalesforce michaelweilsalesforce merged commit 40b51c7 into master Apr 5, 2019
@michaelweilsalesforce michaelweilsalesforce deleted the mw/LOCO-improvement branch April 5, 2019 19:53
@tovbinm tovbinm mentioned this pull request Apr 10, 2019
@shenzgang

Hello! How does the multiclass evaluator scale to compute ROC, PR, and FPR and TPR for each category?

5 participants