Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better names to calibreated linear classification models #3034

Merged
merged 17 commits into from
Mar 25, 2019

Conversation

wschin
Copy link
Member

@wschin wschin commented Mar 20, 2019

Fix #3016 by renaming LogisticRegressionBinaryTrainer to LbfgsLogisticRegressionTrainer. Note that for multiclass case, we have LbfgsMaximumEntropyTrainer. In addition, as SDCA (aka dual coordinate descent methods) outperforms L-BFGS when training logistic regression models, we should not make L-BFGS looks like the default trainer of logistic regression model.

This link contains our the conclusion of those new names.

@wschin wschin added the API Issues pertaining the friendly API label Mar 20, 2019
@wschin wschin self-assigned this Mar 20, 2019
@codecov
Copy link

codecov bot commented Mar 20, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@5f9be36). Click here to learn what that means.
The diff coverage is 96.22%.

@@            Coverage Diff            @@
##             master    #3034   +/-   ##
=========================================
  Coverage          ?   72.52%           
=========================================
  Files             ?      804           
  Lines             ?   144157           
  Branches          ?    16178           
=========================================
  Hits              ?   104552           
  Misses            ?    35193           
  Partials          ?     4412
Flag Coverage Δ
#Debug 72.52% <96.22%> (?)
#production 68.16% <80%> (?)
#test 88.72% <100%> (?)
Impacted Files Coverage Δ
...LogisticRegression/MulticlassLogisticRegression.cs 65.87% <ø> (ø)
test/Microsoft.ML.Functional.Tests/Training.cs 100% <100%> (ø)
...est/Microsoft.ML.StaticPipelineTesting/Training.cs 99.28% <100%> (ø)
...Microsoft.ML.Tests/TrainerEstimators/LbfgsTests.cs 98.02% <100%> (ø)
...est/Microsoft.ML.Tests/FeatureContributionTests.cs 98.55% <100%> (ø)
...est/Microsoft.ML.Predictor.Tests/TestPredictors.cs 63.8% <100%> (ø)
test/Microsoft.ML.Tests/Scenarios/OvaTest.cs 100% <100%> (ø)
...soft.ML.Tests/PermutationFeatureImportanceTests.cs 100% <100%> (ø)
.../Standard/LogisticRegression/LogisticRegression.cs 94.95% <100%> (ø)
test/Microsoft.ML.Functional.Tests/Evaluation.cs 100% <100%> (ø)
... and 4 more


/// <summary>
/// Binary Classification trainer estimators.
/// </summary>
public static class LbfgsBinaryClassificationStaticExtensions
public static class LbfgsBinaryExtensions
Copy link
Contributor

@artidoro artidoro Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LbfgsBinaryExtensions [](start = 24, length = 21)

I think these are just static extensions, if possible could you keep static in the name of the class? #Resolved

Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@yaeldekel
Copy link

yaeldekel commented Mar 20, 2019

    [TlcModule.EntryPoint(Name = "Trainers.LogisticRegressionBinaryClassifier",

Should we update the name of the entry point as well? #WontFix


Refers to: src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LogisticRegression.cs:407 in ba22e12. [](commit_id = ba22e12, deletion_comment = False)

@wschin
Copy link
Member Author

wschin commented Mar 20, 2019

    [TlcModule.EntryPoint(Name = "Trainers.LogisticRegressionBinaryClassifier",

No for BC.


In reply to: 474926238 [](ancestors = 474926238)


Refers to: src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LogisticRegression.cs:407 in ba22e12. [](commit_id = ba22e12, deletion_comment = False)


namespace Microsoft.ML.Trainers
{

/// <include file='doc.xml' path='doc/members/member[@name="LBFGS"]/*' />
/// <include file='doc.xml' path='docs/members/example[@name="LogisticRegressionBinaryClassifier"]/*' />
public sealed partial class LogisticRegressionBinaryTrainer : LbfgsTrainerBase<LogisticRegressionBinaryTrainer.Options,
public sealed partial class LbfgsLogisticRegressionTrainer : LbfgsTrainerBase<LbfgsLogisticRegressionTrainer.Options,
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LbfgsLogisticRegressionTrainer [](start = 32, length = 30)

Can you call it LbfgsBinaryTrainer? #Resolved

Copy link
Member Author

@wschin wschin Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks fine if we consider this trainer and its SDCA counterpart (they all solve LR). However, it will break the consistency between LBFGS trainers. Please take a look at Iteration 3.
LBFGS trainer names:
LbfgsLogisticRegressionTrainer
LbfgsPoissonRegressionTrainer
LbfgsMaximumEntropyTrainer

Note that we can't drop model names (such as LogisticRegression) because we need PoissonRegression to warn users that PoissonRegression is not a common case.


In reply to: 267460936 [](ancestors = 267460936)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfilipi has comment on your issue, and I share it as well.
Putting word Regression without Binary in the trainer is confusing.
If I would saw name of this trainer I would assume it's part of regression task.
Which is not.


In reply to: 267475700 [](ancestors = 267475700,267460936)

Copy link
Member

@sfilipi sfilipi Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, given the target user base, IMO should do a one off for this one, and keep binary in the name.

(also this feels like a classic academic vs engineering solution of a problem :D )

Also, IMO not all details need to be in the name. IMO we don't need to distingush that this is Lbfgs logistic regression through renaming. It can be in the documentations. shorter names are better...


In reply to: 267477762 [](ancestors = 267477762,267475700,267460936)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will make the API looks like mlContext.BinaryClassification.LbfgsBinary(...) so I switch to SDCA-style name LbfgsCalibrated.


In reply to: 267481155 [](ancestors = 267481155,267477762,267475700,267460936)

Copy link
Member Author

@wschin wschin Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, given the target user base, IMO should do a one off for this one, and keep binary in the name.

(also this feels like a classic academic vs engineering solution of a problem :D )

Also, IMO not all details need to be in the name. IMO we don't need to distingush that this is Lbfgs logistic regression through renaming. It can be in the documentations. shorter names are better...

There are several trainers returning the same model types. Only this one is called LogisticRegression and this one has been outperformed by other algorithms. Thus, I want to append Lbfgs to emphasize that this is not the only trainer for logistic regression.

In reply to: 267477762 [](ancestors = 267477762,267475700,267460936)

#Resolved

@@ -516,7 +516,7 @@ public IClassificationLoss CreateComponent(IHostEnvironment env)
/// ]]>
/// </format>
/// </example>
public static LogisticRegressionBinaryTrainer LogisticRegression(this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,
public static LbfgsLogisticRegressionTrainer LbfgsLogisticRegressio(this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LbfgsLogisticRegressio [](start = 53, length = 22)

you miss n in the end.
And I don't think this new verb any better than previous one.
More importantly I don't understand why you even changing this? We had @agoswami issue and PR regarding names, where we spend quite a lot of effort in attempt to figure out names, why you change it out of blue now? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think my changes violate his rule.


In reply to: 267478532 [](ancestors = 267478532)

Copy link
Contributor

@rogancarr rogancarr Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes me nervous so close to ship date. Let's get consensus in #3016 before we make the change. Right now, I feel that the changes are inconsistent and unclear.


In reply to: 267481247 [](ancestors = 267481247,267478532)

@@ -63,7 +63,7 @@ public static void Example()
.Append(ml.Transforms.Text.FeaturizeText("TextFeatures", "Text"))
.Append(ml.Transforms.Concatenate("Features", "TextFeatures", "age", "fnlwgt",
"education-num", "capital-gain", "capital-loss", "hours-per-week"))
.Append(ml.BinaryClassification.Trainers.LogisticRegression());
.Append(ml.BinaryClassification.Trainers.LbfgsCalibrated());
Copy link
Contributor

@rogancarr rogancarr Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LbfgsCalibrated [](start = 57, length = 15)

I don't think we should call it this. LogisticRegression is fine but LbfgsCalibrated is too esoteric. #Resolved

Copy link
Contributor

@rogancarr rogancarr Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus it's not a calibrated linear model. It's fitting the to the logistic loss function. #Resolved

Copy link
Member Author

@wschin wschin Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it returns CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator, I feel LbfgsCalibrated is fine. SDCA is doing the same. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically true, but wouldn't it be clearer from a user perspective to call it LogisticRegression? Everybody knows what that is. Behind the scenes, many packages solve LR with variants of L-BFGS (e.g. Spark), so I don't think it would be misleading to say that it's called LogisticRegression. We can put technical details like "Implemented with L-BFGS etc." in the docs.

Look at FastTree. We don't call that CalibratedLambdaMART ;)


In reply to: 267857743 [](ancestors = 267857743)

Copy link
Member Author

@wschin wschin Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A problem here is that L-BFGS is not good for large-scale sparse data sets. SDCA in that case is usually much faster. If we call LBFGS-LogisticRegression just LogisticRegression, external users may view LBFGS as the default solver to all logistic regression problems. #Resolved

@@ -63,7 +63,7 @@ public static void Example()
.Append(ml.Transforms.Text.FeaturizeText("TextFeatures", "Text"))
.Append(ml.Transforms.Concatenate("Features", "TextFeatures", "age", "fnlwgt",
"education-num", "capital-gain", "capital-loss", "hours-per-week"))
.Append(ml.BinaryClassification.Trainers.LogisticRegression());
.Append(ml.BinaryClassification.Trainers.LbfgsLogisticRegression());
Copy link
Contributor

@rogancarr rogancarr Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about making the other calibrated linear trainers, like SDCA into XyzLogisticRegression(). #Resolved

Copy link
Contributor

@rogancarr rogancarr Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always add that at a later time. #Resolved

rogancarr
rogancarr previously approved these changes Mar 22, 2019
Copy link
Contributor

@rogancarr rogancarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@wschin wschin requested a review from abgoswam March 22, 2019 20:57
@abgoswam
Copy link
Member

abgoswam commented Mar 22, 2019

Here is my understanding of how we arrived at naming convention of Trainers/ModelParameters

This PR proposes to modify the existing APIs as follows :

  • Renames several MLContext names
    • LogisticRegression to LbfgsLogisticRegression
    • PoissonRegression to LbfgsPoissonRegression
  • Renames Class names
    • LogisticRegressionBinaryTrainer to LbfgsLogisticRegressionTrainer

Why do we think adding prefix "Lbfgs" in the name is a better alternative than what we have currently ? In fact to me it seems having 'Lbfgs' as prefix might throw users off

  1. users may miss even seeing LogisticRegression if it has Lbfgs prefix .

  2. create more doubt in their minds . W can have LogisticRegression as the API name and use documentation to clarify what the API implements behind the scene. Doesn't seem right for API design to contain implementation details.

@eerhardt @sfilipi #Resolved

@Ivanidzo4ka Ivanidzo4ka self-requested a review March 22, 2019 22:05
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me block it for now.
I don't think we have agreement among all of us regarding how positive this change is.

@eerhardt
Copy link
Member

eerhardt commented Mar 22, 2019

Why do we think adding prefix "Lbfgs" in the name is a better alternative than what we have currently ? In fact to me it seems having 'Lbfgs' as prefix might throw users off

I believe @wschin answered this in the top comment to the PR:

In addition, as SDCA (aka dual coordinate descent methods) outperforms L-BFGS when training logistic regression models, we should not make L-BFGS looks like the default trainer of logistic regression model.

It seems to me that is his exact intention here. Don't use this one when you want to do LR. Or at least, don't think this is the "good"/"default" one. We have a better one. #Resolved

@rogancarr
Copy link
Contributor

rogancarr commented Mar 22, 2019

I think that it would also make sense to convert the binary SDCA trainer to be called SdcaLogisticRegression if we make this change to Lbfgs. That way, they both get the "LR Brand Name". #Resolved

@abgoswam
Copy link
Member

abgoswam commented Mar 22, 2019

In addition, as SDCA (aka dual coordinate descent methods) outperforms L-BFGS when training logistic regression models, we should not make L-BFGS looks like the default trainer of logistic regression model.

  • The PR proposes adding "LbfgsLogisticRegression" in API name itself to make it concrete that this particular API uses "Lbfgs" as the optimization algorithm. Is that the correct approach ?

  • Are we sure we are using "LBFGS" as the optimizing algorithm ? The internal repo refers to this paper, which leads me to believe that the optimization algo that ML.NET uses is not LBFGS itself, but rather one of its variants "OWN-QN". All the more reason to not include Lbfgs in the name of the API itself

  • Scikit-learn solves this by adding parameter called "solver"

solver : str, {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, default: ‘liblinear’.
Algorithm to use in the optimization problem. #Resolved

@wschin
Copy link
Member Author

wschin commented Mar 23, 2019

In addition, as SDCA (aka dual coordinate descent methods) outperforms L-BFGS when training logistic regression models, we should not make L-BFGS looks like the default trainer of logistic regression model.

* The PR  proposes adding "LbfgsLogisticRegression" in API name itself to make it  concrete  that this  particular  API  uses  "Lbfgs" as the optimization algorithm.  Is that the correct approach ?

LBFGS can mean LBFGS family algorithms.

* Are we sure we are using "LBFGS" as the optimizing algorithm ?  The internal repo  refers to this [paper](https://www.microsoft.com/en-us/research/publication/scalable-training-of-l1-regularized-log-linear-models/?from=http%3A%2F%2Fresearch.microsoft.com%2Fapps%2Fpubs%2Fdefault.aspx%3Fid%3D78900),  which leads me to believe that the optimization algo that ML.NET uses is  not LBFGS itself, but rather  one of its variants "OWN-QN". All the more reason to _not_ include Lbfgs in the name of the API itself

It's a variant of LBFGS, as stated in your link.

* Scikit-learn [solves this](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) by adding parameter  called "solver"

solver : str, {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, default: ‘liblinear’.
Algorithm to use in the optimization problem.

No. It's not necessary. L1-norm --> a variant of LBFGS, No L1-norm --> LBFGS.

#Resolved

@abgoswam
Copy link
Member

abgoswam commented Mar 25, 2019

Couple of notes after discussing with @wschin

  • Since ML.NET is type-safe, we cannot really follow the same paradigm as Scikit-Learn's paradigm in the way it allows for specifying the optimization algorithm using the solver parameter.

  • We can take inspiration from Spark.ML to finilize the API

  • SDCA has other subtelities to it e.g. supports SVM loss, calibration/non-calibration etc. We can keep the SDCA API as is. Note that the current calibrated SDCA doesn't allow user to choose another loss function, so it always produces a logistic regression model. #Resolved

@Ivanidzo4ka Ivanidzo4ka dismissed their stale review March 25, 2019 21:06

revoking review

Copy link
Member

@abgoswam abgoswam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Remove empty line to trigger build
@wschin wschin changed the title Rename LogisticRegressionBinaryTrainer to LbfgsLogisticRegressionTrainer Better names to calibreated linear classification models Mar 25, 2019
@rogancarr rogancarr dismissed their stale review March 25, 2019 21:25

revoking review

@@ -19,20 +19,20 @@
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;

[assembly: LoadableClass(typeof(SymbolicSgdTrainer), typeof(SymbolicSgdTrainer.Options),
[assembly: LoadableClass(typeof(SymbolicSgdLogisticRegressionBinaryTrainer), typeof(SymbolicSgdLogisticRegressionBinaryTrainer.Options),
Copy link
Contributor

@rogancarr rogancarr Mar 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SymbolicSgdLogisticRegressionBinaryTrainer [](start = 32, length = 42)

LogisticRegressionBinaryTrainer => LbfgsLogisticRegressionTrainer. Shall SymbolicSgdBinaryTrainer => SymbolicSgdLogisticRegressionTrainer instead of SymbolicSgdLogisticRegressionBinaryTrainer? Right now, the namings don't line up. #Resolved

Copy link
Member

@abgoswam abgoswam Mar 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rogancarr..could you cross-check ? To me the Class names do seem to match up fine, with both having the word Binary in the name

  • LbfgsLogisticRegressionBinaryTrainer
  • SymbolicSgdLogisticRegressionBinaryTrainer

In reply to: 268859710 [](ancestors = 268859710)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Our conclusion was having LR and Binary for trainer classes and only LR for APIs.


In reply to: 268865148 [](ancestors = 268865148,268859710)

Copy link
Contributor

@rogancarr rogancarr Mar 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I was mixing up catalogs & classes. #Resolved

@@ -181,7 +181,7 @@ public static class StandardTrainersCatalog
}

/// <summary>
/// Predict a target using a linear classification model trained with <see cref="SdcaCalibratedBinaryTrainer"/>.
/// Predict a target using a linear classification model trained with <see cref="SdcaLogisticRegressionBinaryTrainer"/>.
Copy link
Contributor

@rogancarr rogancarr Mar 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SdcaLogisticRegressionBinaryTrainer [](start = 89, length = 35)

Similar to SymSgd, now Sdca has LogisticRegression and Binary. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, adding both of LogisticRegression and Binary to trainer classes was our decision. For API, we only append LogisticRegression because Binary naturally comes from their context.


In reply to: 268859905 [](ancestors = 268859905)

@wschin wschin merged commit 8730c87 into dotnet:master Mar 25, 2019
@wschin wschin deleted the sync-lbfgs-lr-me branch March 25, 2019 23:47
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants