Fixing ModelParameter discrepancies #2968

abgoswam · 2019-03-14T23:31:42Z

Fixes the seven or so ModelParameter types whose names were inconsistent with the rest of the ModelParameter types.

Couple of notes:

The PR follows the naming convention used by other ModelParameter types in the codebase:
- {AlgoName}(optional){TypeOfTask}ModelParameters
- {TypeOfTask} is added only when needed to distinguish between Binary, Regression, or Multiclass

ModelParameter types do not use the word Classification in the {TypeOfTask}. This PR follows that convention.
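
As a rough illustration of the pattern above, here is a minimal sketch of how such names compose. `ModelParameterNaming.Compose` is a hypothetical helper written only for this example; it is not part of ML.NET, and the real types are simply named by hand.

```csharp
using System;

public static class ModelParameterNaming
{
    // Hypothetical helper, not part of ML.NET: composes a type name as
    // {AlgoName}{TypeOfTask}ModelParameters, leaving the task out when it is not needed.
    public static string Compose(string algoName, string typeOfTask = "")
    {
        if (string.IsNullOrEmpty(algoName))
            throw new ArgumentException("An algorithm name is required.", nameof(algoName));

        // Note that the task segment is "Binary" or "Multiclass", never "...Classification".
        return algoName + typeOfTask + "ModelParameters";
    }
}

public static class Program
{
    public static void Main()
    {
        // Task segment present because the algorithm covers more than one task.
        Console.WriteLine(ModelParameterNaming.Compose("FastTree", "Binary"));

        // Task segment omitted because it is not needed to disambiguate.
        Console.WriteLine(ModelParameterNaming.Compose("Ols"));
    }
}
```

The two calls produce names that already appear in this PR: `FastTreeBinaryModelParameters` and `OlsModelParameters`.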

EDIT: MulticlassLogisticRegressionModelParameters is being refactored under separate issue #1100, so it is not fixed in this PR.

wschin · 2019-03-14T23:44:09Z

src/Microsoft.ML.Ensemble/Trainer/Multiclass/MulticlassDataPartitionEnsembleTrainer.cs

@@ -66,7 +66,7 @@ public Arguments()
                            // estimator, as opposed to a regular trainer.
                            var trainerEstimator = new LogisticRegressionMulticlassClassificationTrainer(env, LabelColumnName, FeatureColumnName);
                            return TrainerUtils.MapTrainerEstimatorToTrainer<LogisticRegressionMulticlassClassificationTrainer,
-                                MulticlassLogisticRegressionModelParameters, MulticlassLogisticRegressionModelParameters>(env, trainerEstimator);
+                                LogisticRegressionMulticlassModelParameters, LogisticRegressionMulticlassModelParameters>(env, trainerEstimator);


We can NOT swap them. Logistic regression is never a multiclass classification model, while multi-class logistic regression is an alternative name for multinomial logistic regression. Can you revert this change? I will handle this in my issue #1100, where I am refactoring it. #Resolved

One alternative name we can use is SoftmaxRegression (also mentioned in wikipedia link above).

While refactoring, could you fix the name of the trainer estimator for this as well?

I will revert this and wait for your refactoring PR. Sounds good?

In reply to: 265806976 [](ancestors = 265806976)

codecov · 2019-03-15T00:09:07Z

Codecov Report

Merging #2968 into master will increase coverage by <.01%.
The diff coverage is 86.41%.

@@            Coverage Diff             @@
##           master    #2968      +/-   ##
==========================================
+ Coverage   72.35%   72.35%   +<.01%     
==========================================
  Files         803      803              
  Lines      143296   143296              
  Branches    16155    16155              
==========================================
+ Hits       103675   103679       +4     
+ Misses      35194    35191       -3     
+ Partials     4427     4426       -1

Flag	Coverage Δ
#Debug	`72.35% <86.41%> (ø)`	⬆️
#production	`68.06% <72.15%> (ø)`	⬆️
#test	`88.52% <100%> (ø)`	⬆️

Impacted Files	Coverage Δ
src/Microsoft.ML.LightGbm/LightGbmArguments.cs	`89.63% <ø> (ø)`	⬆️
...LogisticRegression/MulticlassLogisticRegression.cs	`67.46% <ø> (ø)`	⬆️
...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs	`72.68% <ø> (ø)`	⬆️
...rs/Standard/PoissonRegression/PoissonRegression.cs	`88.57% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamModelParameters.cs	`46.51% <0%> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeArguments.cs	`85.38% <0%> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamTrainer.cs	`90.38% <0%> (ø)`	⬆️
test/Microsoft.ML.Functional.Tests/Training.cs	`100% <100%> (ø)`	⬆️
...est/Microsoft.ML.Predictor.Tests/TestPredictors.cs	`63.8% <100%> (ø)`	⬆️
...icrosoft.ML.Functional.Tests/DataTransformation.cs	`100% <100%> (ø)`	⬆️
... and 46 more

wschin

LGTM.

TomFinley · 2019-03-15T18:07:15Z

src/Microsoft.ML.Mkl.Components/OlsLinearRegression.cs

@@ -24,16 +24,16 @@
    OlsTrainer.LoadNameValue,
    OlsTrainer.ShortName)]

-[assembly: LoadableClass(typeof(OrdinaryLeastSquaresRegressionModelParameters), null, typeof(SignatureLoadModel),
+[assembly: LoadableClass(typeof(OlsModelParameters), null, typeof(SignatureLoadModel),


OlsModelParameters [](start = 32, length = 18)

So, general .NET naming guidelines seem to suggest that we avoid acronyms. That is, OrdinaryLeastSquaresRegressionModelParameters is right, and OlsTrainer is wrong. Similarly with PCA. Have we made a conscious decision to discard the .NET guidelines in this case? If we have, could I see where that discussion happened so I can better understand the rationale? It is not in the linked issue, but may have occurred somewhere else. Thanks. #Resolved

I think the discussion is in this thread: #2762 #Resolved

The summary is here : #2762 (comment)

This came out of a suggestion from @sfilipi and @eerhardt.

"For some trainers estimators it is OK to use acronyms for {AlgoName} . This is typically when (a) the expanded name doesn't provide more value to typical users than the acronym or (b) the names becomes too long (c) the acronym is unique in the field." #Resolved

Ah, all right, thanks for linking the relevant issues @artidoro and @abgoswam. #Resolved

artidoro · 2019-03-15T18:43:08Z

...crosoft.ML.StandardTrainers/Standard/MulticlassClassification/MulticlassNaiveBayesTrainer.cs


 [assembly: LoadableClass(typeof(void), typeof(NaiveBayesTrainer), null, typeof(SignatureEntryPointModule), NaiveBayesTrainer.LoadName)]

 namespace Microsoft.ML.Trainers
 {
-    public sealed class NaiveBayesTrainer : TrainerEstimatorBase<MulticlassPredictionTransformer<MulticlassNaiveBayesModelParameters>, MulticlassNaiveBayesModelParameters>
+    public sealed class NaiveBayesTrainer : TrainerEstimatorBase<MulticlassPredictionTransformer<NaiveBayesModelParameters>, NaiveBayesModelParameters>


NaiveBayesTrainer [](start = 24, length = 17)

Why does NaiveBayesTrainer not have Multiclass or MulticlassClassification in its name? #Resolved

NaiveBayes is a very general probabilistic technique and could be applied to many different tasks, so I don't understand the rationale for removing the task from its name.

In reply to: 266106131 [](ancestors = 266106131)

#2762 (comment)

{TypeOfTask} is added only when the algorithm supports multiple kinds of tasks. When added, we will spell out each task fully. #Resolved

We renamed it as per the discussion here.

#2762 (comment)

• {TypeOfTask} is added only when the algorithm supports multiple kinds of tasks. When added, we will spell out each task fully.

Do you have any suggestions?

In reply to: 266107249 [](ancestors = 266107249,266106131)

My concern is that if we later add this algorithm for binary classification, adding Multiclass to the name NaiveBayes would be a breaking change. I think the above guideline works well when the algorithm can only be applied to one learning task. But this algorithm can be applied to at least binary and multiclass, so I would suggest adding MulticlassClassification to its name, and similarly for the ModelParameters.
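
To make the concern concrete, here is a minimal sketch using hypothetical stand-in types (these are not the real ML.NET classes, and the binary variant is purely an assumption for illustration). If the type ships under the unqualified name and a binary variant arrives later, the only way to disambiguate is to rename the shipped type, which breaks every caller that refers to it. Qualifying the name up front leaves room for a sibling.

```csharp
using System;

namespace NamingSketch
{
    // Hypothetical stand-ins for illustration only; these are not the ML.NET types.

    // Task-qualified from the first release.
    public sealed class NaiveBayesMulticlassModelParameters { }

    // A hypothetical later addition. It slots in beside the type above
    // without renaming anything that has already shipped.
    public sealed class NaiveBayesBinaryModelParameters { }

    public static class Program
    {
        public static void Main()
        {
            // Caller code written against the first release keeps compiling
            // after the binary variant is added, because no name changed.
            var multiclass = new NaiveBayesMulticlassModelParameters();
            var binary = new NaiveBayesBinaryModelParameters();

            Console.WriteLine(multiclass.GetType().Name);
            Console.WriteLine(binary.GetType().Name);
        }
    }
}
```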

In reply to: 266136420 [](ancestors = 266136420,266107249,266106131)

I see. Currently in ML.NET we only have NaiveBayes under the MulticlassClassification MLContext.

So your concern is about a possible breaking change in the future if we want to add NaiveBayes inside the BinaryClassification MLContext? Would we ever want to do that?

In reply to: 266138507 [](ancestors = 266138507,266136420,266107249,266106131)

Yes that was my concern, maybe @[email protected] has an opinion about this?

In reply to: 266143059 [](ancestors = 266143059,266138507,266136420,266107249,266106131)

@TomFinley . Would be great to get your thoughts on this. #Resolved

I would keep multiclass on NaiveBayes too.

In reply to: 266143921 [](ancestors = 266143921,266143059,266138507,266136420,266107249,266106131)

Looks like we favor naming it NaiveBayesMulticlass. Updating accordingly.

In reply to: 266568218 [](ancestors = 266568218,266143921,266143059,266138507,266136420,266107249,266106131)

sfilipi · 2019-03-18T17:50:26Z

src/Microsoft.ML.FastTree/RandomForestClassification.cs

@@ -49,7 +49,7 @@ internal FastForestOptionsBase()
        }
    }

-    public sealed class FastForestClassificationModelParameters :
+    public sealed class FastForestMulticlassModelParameters :


FastForestMulticlassModelParameters [](start = 24, length = 35)

thank you for not making it FastForestMulticlassClassificationModelParameters :) #WontFix

sfilipi

TomFinley

Thanks @abgoswam !

eerhardt · 2019-03-18T20:26:03Z

src/Microsoft.ML.FastTree/GamClassification.cs

@@ -171,7 +171,7 @@ private protected override SchemaShape.Column[] GetOutputColumnsCore(SchemaShape
    /// <summary>
    /// The model parameters class for Binary Classification GAMs
    /// </summary>
-    public sealed class BinaryClassificationGamModelParameters : GamModelParametersBase, IPredictorProducing<float>
+    public sealed class GamBinaryModelParameters : GamModelParametersBase, IPredictorProducing<float>


BinaryClassification?

Do we need to put Classification in the name? We do everywhere else. #Resolved

For ModelParameters we do not. Please see the other comment.

In reply to: 266626783 [](ancestors = 266626783)

eerhardt · 2019-03-18T20:26:56Z

...crosoft.ML.StandardTrainers/Standard/MulticlassClassification/MulticlassNaiveBayesTrainer.cs


 namespace Microsoft.ML.Trainers
 {
-    public sealed class NaiveBayesTrainer : TrainerEstimatorBase<MulticlassPredictionTransformer<MulticlassNaiveBayesModelParameters>, MulticlassNaiveBayesModelParameters>
+    public sealed class NaiveBayesMulticlassTrainer : TrainerEstimatorBase<MulticlassPredictionTransformer<NaiveBayesMulticlassModelParameters>, NaiveBayesMulticlassModelParameters>


NaiveBayesMulticlassTrainer => NaiveBayesMulticlassClassificationTrainer? Why are we dropping Classification from these names? #Resolved

As noted in the bug description, it seems that for ModelParameters we are consistent in not using the word "Classification". I am following the same convention in this PR.

Example:

LinearBinaryModelParameters

FastTreeBinaryModelParameters

I don't think we have completely settled the "Classification" debate yet; see #2623.

Here is my observation:

For Trainers, we use "Classification"

For ModelParameters, we do not use "Classification"

In reply to: 266627123 [](ancestors = 266627123)

For Trainers, we use "Classification"
For ModelParameters, we do not use "Classification"

Why is this a valid policy? #Resolved

I cannot claim it's a "valid" policy, but one observation I have is that

we do not use "Classification" anywhere for ModelParameters

CalibratedModelParametersBase

LinearBinaryModelParameters

in some cases it leads to long names e.g. FastForestMulticlassClassificationModelParameter

We have two options:

A. Follow the naming convention used by Trainers, i.e. "<>BinaryClassification" and "MulticlassClassification", uniformly.
B. Not use "Classification" for ModelParameters.

Which one do you favor? @eerhardt @sfilipi #Resolved

eerhardt

fixing model parameter discrepencies

7b2fffe

abgoswam requested review from sfilipi and artidoro March 14, 2019 23:31

wschin reviewed Mar 14, 2019

abgoswam added 2 commits March 15, 2019 16:02

multiclass LR singe that refactoring is happening in a parallel PR

e5fff99

fix merge conflicts

0ca3031

wschin approved these changes Mar 15, 2019

TomFinley reviewed Mar 15, 2019

artidoro reviewed Mar 15, 2019

sfilipi reviewed Mar 18, 2019

sfilipi approved these changes Mar 18, 2019

TomFinley approved these changes Mar 18, 2019

abgoswam added 3 commits March 18, 2019 19:44

Merge branch 'master' into abgoswam/modelparameters

2788023

Merge branch 'master' into abgoswam/modelparameters

50fc607

review comments. Added Multiclass to NaiveBayes

64b8969

eerhardt reviewed Mar 18, 2019

abgoswam added 5 commits March 18, 2019 23:21

Drop Classification from trainer names - v1 (more trainers to follow)

dd3f4b8

Merge branch 'master' into abgoswam/modelparameters

df99292

multiclass LR will be handled separately

6fe26d3

Drop Classification from trainer names - v2 (all trainers taken care of)

2f0aba0

fix entrypoint file

26e2810

eerhardt approved these changes Mar 19, 2019

abgoswam merged commit 0831865 into dotnet:master Mar 19, 2019

abgoswam deleted the abgoswam/modelparameters branch March 20, 2019 20:13

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
