Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify API for advanced settings. (SDCA) #2093

Merged
merged 13 commits into from
Jan 18, 2019

Conversation

abgoswam
Copy link
Member

@abgoswam abgoswam commented Jan 9, 2019

Towards #1798 .

This PR addresses only the SDCA agos. For the other remaining components, we will have separate PRs.

The following changes have been made:

  1. Two public extension methods, one for simple arguments and the other for advanced options
  2. Make constructors internal
  3. Pass Options objects as arguments instead of Action delegate
  4. Rename Arguments to Options
  5. Rename Options objects as options (instead of args or advancedSettings used so far)
  6. Added Help text as the XML comment of Options

@abgoswam abgoswam changed the title WIP : Modify API for advanced settings. (SDCA) Modify API for advanced settings. (SDCA) Jan 14, 2019
@@ -25,7 +25,8 @@ private class TestData
public float[] b;
}

[ConditionalFact(typeof(Environment), nameof(Environment.Is64BitProcess))] // TensorFlow is 64-bit only
[Fact(Skip = "TF Tests fail")]
Copy link
Member Author

@abgoswam abgoswam Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not review this file. I am skipping TF tests only because they seem to be flaky at this point.

Comment here
#Closed

@@ -69,6 +69,13 @@ public abstract class LinearTrainerBase<TTransformer, TModel> : TrainerEstimator
{
}

private protected LinearTrainerBase(IHostEnvironment env, string featureColumn, SchemaShape.Column labelColumn,
SchemaShape.Column weightColumn)
Copy link
Member

@sfilipi sfilipi Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need this? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to account for the corresponding changes in line #1728-1730

I saw some tests fail saying "Weights" column not found. The fix was to specify options.WeightColumn.IsExplicit in the call below.

TrainerUtils.MakeR4ScalarWeightColumn(options.WeightColumn, options.WeightColumn.IsExplicit)


In reply to: 248006972 [](ancestors = 248006972)

Copy link
Member Author

@abgoswam abgoswam Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got rid of this


In reply to: 248058147 [](ancestors = 248058147,248006972)

@@ -1495,11 +1496,11 @@ internal override void Check(IHostEnvironment env)
_outputColumns = outCols.ToArray();
}

internal SdcaBinaryTrainer(IHostEnvironment env, Arguments args,
internal SdcaBinaryTrainer(IHostEnvironment env, Options options,
string featureColumn, string labelColumn, string weightColumn = null)
Copy link
Member

@sfilipi sfilipi Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string featureColumn, string labelColumn, string weightColumn = null [](start = 11, length = 69)

are those still needed? can they be passed to options? The other ctor below can go away than. #Resolved

Copy link
Member Author

@abgoswam abgoswam Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


In reply to: 248021483 [](ancestors = 248021483)

@@ -1594,7 +1595,7 @@ public sealed class StochasticGradientDescentClassificationTrainer :
internal const string UserNameValue = "Hogwild SGD (binary)";
internal const string ShortName = "HogwildSGD";

public sealed class Arguments : LearnerInputBaseWithWeight
public sealed class Options : LearnerInputBaseWithWeight
{
[Argument(ArgumentType.Multiple, HelpText = "Loss Function", ShortName = "loss", SortOrder = 50)]
public ISupportClassificationLossFactory LossFunction = new LogLossFactory();
Copy link
Member

@sfilipi sfilipi Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISupportClassificationLossFactory [](start = 19, length = 33)

XML docs #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #2154 to fix this issue in bulk for several learners


In reply to: 248022036 [](ancestors = 248022036)

@sfilipi
Copy link
Member

sfilipi commented Jan 15, 2019

    /// </example>

please remove duplicate example #Resolved


Refers to: src/Microsoft.ML.StandardLearners/StandardLearnersCatalog.cs:116 in f0b9565. [](commit_id = f0b9565, deletion_comment = False)

/// <![CDATA[
/// [!code-csharp[SDCA](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/SDCA.cs)]
/// ]]></format>
/// </example>
Copy link
Member

@sfilipi sfilipi Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove duplicate example #Resolved

Contracts.CheckValue(ctx, nameof(ctx));
var env = CatalogUtils.GetEnvironment(ctx);

return new StochasticGradientDescentClassificationTrainer(env, options);
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment: everywhere else you don't skip the line. #Resolved

: base(env, featureColumn, TrainerUtils.MakeBoolScalarLabel(labelColumn), weightColumn)
{
Host.CheckNonEmpty(featureColumn, nameof(featureColumn));
Host.CheckNonEmpty(labelColumn, nameof(labelColumn));

_args = new Arguments();
_args = new Options();
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_args [](start = 12, length = 5)

Could you make this _options? So that it is consistent with the general renaming? #Resolved

/// <param name="env">The environment to use.</param>
/// <param name="options">Advanced arguments to the algorithm.</param>
internal StochasticGradientDescentClassificationTrainer(IHostEnvironment env, Options options)
: base(env, options.FeatureColumn, TrainerUtils.MakeBoolScalarLabel(options.LabelColumn), TrainerUtils.MakeR4ScalarWeightColumn(options.WeightColumn, options.WeightColumn.IsExplicit))
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TrainerUtils.MakeR4ScalarWeightColumn(options.WeightColumn, options.WeightColumn.IsExplicit) [](start = 102, length = 92)

I think you can replace this with:

options.WeightColumn.IsExplicit ? options.WeightColumn : null

That way you won't need the other constructor for the base. This is a strange problem from wanting to keep the optional for the weightColumn which is used in Maml. #Resolved

Copy link
Member Author

@abgoswam abgoswam Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfilipi who is taking a look at how best to handle the weights column

In this PR, I would like to keep it as is. This has ramifications for several tests.


In reply to: 248102195 [](ancestors = 248102195)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a pretty simple fix, which reduces the number of constructors in the base class, I would check if it works :)


In reply to: 248104112 [](ancestors = 248104112,248102195)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified. Thanks for debugging !


In reply to: 248104779 [](ancestors = 248104779,248104112,248102195)

{
public const string LoadNameValue = "SDCAMC";
public const string UserNameValue = "Fast Linear Multi-class Classification (SA-SDCA)";
public const string ShortName = "sasdcamc";
internal const string Summary = "The SDCA linear multi-class classification trainer.";

public sealed class Arguments : ArgumentsBase
public sealed class Options : ArgumentsBase
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArgumentsBase [](start = 38, length = 13)

Are we planning to rename this too? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point the focus is to get the public APIs fixed.

Rename of internal classes / variables is low priority at this point.


In reply to: 248102673 [](ancestors = 248102673)

@@ -48,7 +48,7 @@ public void SdcaRegression()

var est = reader.MakeNewEstimator()
.Append(r => (r.label, score: ctx.Trainers.Sdca(r.label, r.features, maxIterations: 2,
onFit: p => pred = p, advancedSettings: s => s.NumThreads = 1)));
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumThreads [](start = 63, length = 10)

Would be good to have numthreads to 1.
Unfortunately, when that's not the case, it is more prone to having random failures from small discrepancies in test results especially on 32 bit machines. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


In reply to: 248104144 [](ancestors = 248104144)

@@ -88,7 +88,7 @@ public void SdcaRegressionNameCollision()
separator: ';', hasHeader: true);

var est = reader.MakeNewEstimator()
.Append(r => (r.label, r.Score, score: ctx.Trainers.Sdca(r.label, r.features, maxIterations: 2, advancedSettings: s => s.NumThreads = 1)));
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.NumThreads = 1 [](start = 136, length = 15)

Same here. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


In reply to: 248104199 [](ancestors = 248104199)

@@ -121,8 +121,7 @@ public void SdcaBinaryClassification()
var est = reader.MakeNewEstimator()
.Append(r => (r.label, preds: ctx.Trainers.Sdca(r.label, r.features,
maxIterations: 2,
onFit: (p, c) => { pred = p; cali = c; },
advancedSettings: s => s.NumThreads = 1)));
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.NumThreads = 1 [](start = 42, length = 17)

And here. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


In reply to: 248104248 [](ancestors = 248104248)

@@ -169,8 +168,7 @@ public void SdcaBinaryClassificationNoCalibration()
var est = reader.MakeNewEstimator()
.Append(r => (r.label, preds: ctx.Trainers.Sdca(r.label, r.features,
maxIterations: 2,
loss: loss, onFit: p => pred = p,
advancedSettings: s => s.NumThreads = 1)));
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.NumThreads = 1 [](start = 38, length = 17)

Here too. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


In reply to: 248104923 [](ancestors = 248104923)

@@ -940,8 +938,7 @@ public void HogwildSGDBinaryClassification()
var est = reader.MakeNewEstimator()
.Append(r => (r.label, preds: ctx.Trainers.StochasticGradientDescentClassificationTrainer(r.label, r.features,
l2Weight: 0,
onFit: (p) => { pred = p; },
advancedSettings: s => s.NumThreads = 1)));
Copy link
Contributor

@artidoro artidoro Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.NumThreads = 1 [](start = 43, length = 16)

And here. #Resolved

Contracts.CheckValueOrNull(onFit);

var rec = new TrainerEstimatorReconciler.Regression(
(env, labelName, featuresName, weightsName) =>
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weightsName [](start = 47, length = 11)

this is not getting used. #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue #2175 to look into usages of weights parameter in our public API

For several learners (e.g. SDCA) it seems not necessary


In reply to: 248769861 [](ancestors = 248769861)

/// <param name="advancedSettings">A delegate to set more settings.
/// The settings here will override the ones provided in the direct method signature,
/// if both are present and have different values.
/// The columns names, however need to be provided directly, not through the <paramref name="advancedSettings"/>.</param>
public SdcaRegressionTrainer(IHostEnvironment env,
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public SdcaRegressionTrainer(IHos [](start = 5, length = 36)

shall this ctor go away? #Resolved

Copy link
Member Author

@abgoswam abgoswam Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made internal. deletion will happen in #2100


In reply to: 248770050 [](ancestors = 248770050)

{
Host.CheckNonEmpty(featureColumn, nameof(featureColumn));
Host.CheckNonEmpty(labelColumn, nameof(labelColumn));
_loss = loss ?? Args.LossFunction.CreateComponent(env);
Loss = _loss;
}

internal SdcaRegressionTrainer(IHostEnvironment env, Arguments args, string featureColumn, string labelColumn, string weightColumn = null)
: base(env, args, TrainerUtils.MakeR4ScalarColumn(labelColumn), TrainerUtils.MakeR4ScalarWeightColumn(weightColumn))
internal SdcaRegressionTrainer(IHostEnvironment env, Options options, string featureColumn, string labelColumn, string weightColumn = null)
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal SdcaRegressionTrainer [](start = 7, length = 31)

I'd remove this one too. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2100


In reply to: 248770295 [](ancestors = 248770295)

{
Host.CheckNonEmpty(featureColumn, nameof(featureColumn));
Host.CheckNonEmpty(labelColumn, nameof(labelColumn));
_loss = loss ?? Args.LossFunction.CreateComponent(env);
Loss = _loss;
}

internal SdcaMultiClassTrainer(IHostEnvironment env, Arguments args,
internal SdcaMultiClassTrainer(IHostEnvironment env, Options options,
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal SdcaMultiClassTrainer [](start = 8, length = 30)

delete, maybe #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2100


In reply to: 248770420 [](ancestors = 248770420)

/// <param name="advancedSettings">A delegate to set more settings.
/// The settings here will override the ones provided in the direct method signature,
/// if both are present and have different values.
/// The columns names, however need to be provided directly, not through the <paramref name="advancedSettings"/>.</param>
public SdcaMultiClassTrainer(IHostEnvironment env,
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public SdcaMultiClassTrainer [](start = 6, length = 30)

same #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made internal. deletion will happen in #2100


In reply to: 248770465 [](ancestors = 248770465)

{
var args = new TArgs();

// Apply the advanced args, if the user supplied any.
advancedSettings?.Invoke(args);
args.FeatureColumn = featureColumn;
args.LabelColumn = labelColumn.Name;
return args;
}

internal SdcaTrainerBase(IHostEnvironment env, string featureColumn, SchemaShape.Column labelColumn,
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal SdcaTrainerBase [](start = 6, length = 26)

delete #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2100


In reply to: 248770895 [](ancestors = 248770895)

@@ -1391,12 +1389,13 @@ public void Add(Double summand)
}
}

public sealed class SdcaBinaryTrainer : SdcaTrainerBase<SdcaBinaryTrainer.Arguments, BinaryPredictionTransformer<TScalarPredictor>, TScalarPredictor>
public sealed class SdcaBinaryTrainer : SdcaTrainerBase<SdcaBinaryTrainer.Options, BinaryPredictionTransformer<TScalarPredictor>, TScalarPredictor>
{
public const string LoadNameValue = "SDCA";
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 8, length = 6)

all those strings should be internal.. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some separate issue for it that I can link to this comment ?

I would like to keep the PRs for public API focused on API rather than fixing issues across the codebase :)


In reply to: 248771027 [](ancestors = 248771027)


}

public SdcaBinaryTrainer(IHostEnvironment env, Arguments args)
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public SdcaBinaryTrainer(IHostEnvironment env, Arguments args) [](start = 6, length = 64)

i think this is needed for the SignatureTrainer. Doesn't have to be public. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the constructor above has this signature. so its not really going away.


In reply to: 248771919 [](ancestors = 248771919)

@@ -1688,53 +1675,50 @@ internal static class Defaults
/// <param name="initLearningRate">The initial learning rate used by SGD.</param>
/// <param name="l2Weight">The L2 regularizer constant.</param>
/// <param name="loss">The loss function to use.</param>
/// <param name="advancedSettings">A delegate to apply all the advanced arguments to the algorithm.</param>
public StochasticGradientDescentClassificationTrainer(IHostEnvironment env,
internal StochasticGradientDescentClassificationTrainer(IHostEnvironment env,
Copy link
Member

@sfilipi sfilipi Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal StochasticGradientDescentClassificationTraine [](start = 8, length = 54)

I'd remove all ctors but the one with the (IHostEnvironment env, Arguments) signature, unless is needed for inheritance. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly. thats the point of issue #2100. but that would be a separate PR.

these PRs focus on public API which is P0 (as per consensus in email thread)


In reply to: 248772619 [](ancestors = 248772619)

Loss = _loss;
}

internal SdcaRegressionTrainer(IHostEnvironment env, Arguments args)
: this(env, args, args.FeatureColumn, args.LabelColumn)
internal SdcaRegressionTrainer(IHostEnvironment env, Options options)
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SdcaRegressionTrainer [](start = 17, length = 21)

isnt there suposed to be a constructor with common arguments and one with options? If we make the other two constructors internal (and eventually remove), then we have just this constructor with the options. Is that OK? Does SDCARegressionTrainer have any common args? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not quite sure what you mean.

After #2100, only 1 constructor should remain.


In reply to: 248865494 [](ancestors = 248865494)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that main point with the API changes is that we end up with two constructors that are publicly available to the user:

  • one constructor that would be used for most scenarios and contains a set of arguments that are just passed in Foo(arg1, arg2...)
  • one constructor that only takes in options if we want to set advanced features Foo(options)

Now the first constructor can use the options constructor under the covers, i.e. Foo(arg1, arg2): this(new Options(){ arg1 = arg1, arg2 = arg2 }... but we would still expose two different constructors publicly that can be used.

Is that not the case for SDCA?


In reply to: 248870888 [](ancestors = 248870888,248865494)

Copy link
Member Author

@abgoswam abgoswam Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We should have two MLContext extensions that are publicly available to the user. This is the public API for ML.NET
  2. All constructors are internal
  3. With Refactoring of Constructors #2100 we will unify the internal constructors. But thats P2 for now.

In reply to: 248880304 [](ancestors = 248880304,248870888,248865494)

internal SdcaBinaryTrainer(IHostEnvironment env, Arguments args,
string featureColumn, string labelColumn, string weightColumn = null)
: base(env, args, TrainerUtils.MakeBoolScalarLabel(labelColumn), TrainerUtils.MakeR4ScalarWeightColumn(weightColumn))
internal SdcaBinaryTrainer(IHostEnvironment env, Options options)
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as I made with the SDCARegression -- does SdcaBinary not have common args or is it always options? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will eventually have just 1 constructor (the one with Options)

#2100


In reply to: 248866446 [](ancestors = 248866446)

@@ -1688,53 +1674,50 @@ internal static class Defaults
/// <param name="initLearningRate">The initial learning rate used by SGD.</param>
/// <param name="l2Weight">The L2 regularizer constant.</param>
/// <param name="loss">The loss function to use.</param>
/// <param name="advancedSettings">A delegate to apply all the advanced arguments to the algorithm.</param>
public StochasticGradientDescentClassificationTrainer(IHostEnvironment env,
internal StochasticGradientDescentClassificationTrainer(IHostEnvironment env,
string labelColumn = DefaultColumnNames.Label,
string featureColumn = DefaultColumnNames.Features,
string weightColumn = null,
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be set to Defaults.WeightColumn? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to take closer look at how 'weights' are being used, or if they are being used at all

please see #2175.


In reply to: 248867308 [](ancestors = 248867308)

/// <param name="env">The environment to use.</param>
/// <param name="options">Advanced arguments to the algorithm.</param>
internal StochasticGradientDescentClassificationTrainer(IHostEnvironment env, Options options)
: base(env, options.FeatureColumn, TrainerUtils.MakeBoolScalarLabel(options.LabelColumn), options.WeightColumn.IsExplicit ? options.WeightColumn.Value : null)
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused on why the if check is needed. You are saying that if the WeightColumn is explicitly set use the value, otherwise pass null. If null is used when its not set, then is DefaultColumnNames.Weight needed? If not, could the line above:
_options.WeightColumn = ... : Optional.Implicit(DefaultColumnNames.Weight) to:
_options.WeightColumn = ... : Optional.Implicit(null)

To me the code above does the branching to figure out what parameter to use - its weird to branch afterwards to figure out if its explicit or not.
#Resolved

Copy link
Member Author

@abgoswam abgoswam Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't follow the question properly .

basically without this, some tests fail. also i created #2175 to verify if weights is actually being used at all or not


In reply to: 248871449 [](ancestors = 248871449)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what is happening. So Options.WeightColumn is an Optional and it does default to DefaultColumNames.Weight...
Maybe change this line:
_options.WeightColumn = weightColumn != null ? Optional.Explicit(weightColumn) : Optional.Implicit(DefaultColumnNames.Weight);
to
_options.WeightColumn = weightColumn != null ? Optional.Explicit(weightColumn) ;

Since its setup with a default?


In reply to: 248873304 [](ancestors = 248873304,248871449)

}

/// <summary>
/// Predict a target using a linear binary classification model trained with the SDCA trainer, and log-loss.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, you can remove the comma for , and log-loss.

/// Predict a target using a linear binary classification model trained with the SDCA trainer, and log-loss.
/// </summary>
/// <param name="ctx">The binary classification context trainer object.</param>
/// <param name="label">The label, or dependent variable.</param>
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here: label or dependent variable. Same thing with the line below "features or independent weights". Same comment applies to other apis as it looks like this comment was copied.

/// <param name="label">The label, or dependent variable.</param>
/// <param name="features">The features, or independent variables.</param>
/// <param name="weights">The optional example weights.</param>
/// <param name="options">Advanced arguments to the algorithm.</param>
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguments [](start = 43, length = 9)

options? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2154


In reply to: 248872745 [](ancestors = 248872745)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here -- this is all new code, why not change it here?


In reply to: 248872978 [](ancestors = 248872978,248872745)

SdcaBinaryTrainer.Options options,
Action<LinearBinaryModelParameters, ParameterMixingCalibratedPredictor> onFit = null)
{
Contracts.CheckValue(label, nameof(label));
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious - this is a documented example right? Is it worth having CheckValues in the example? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the Static API extension


In reply to: 248873167 [](ancestors = 248873167)

/// from negative to positive infinity), and the predicted label.</returns>
public static (Scalar<float> score, Scalar<bool> predictedLabel) Sdca(
this BinaryClassificationContext.BinaryClassificationTrainers ctx,
Scalar<bool> label, Vector<float> features,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector [](start = 36, length = 6)

Should features go on a separate line to be consistent with the other args?

/// <param name="label">The name of the label column.</param>
/// <param name="features">The name of the feature column.</param>
/// <param name="weights">The name for the example weight column.</param>
/// <param name="options">Advanced arguments to the algorithm.</param>
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguments [](start = 43, length = 9)

options? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2154


In reply to: 248874445 [](ancestors = 248874445)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is a new comment/new code. why not update it with this PR?


In reply to: 248874778 [](ancestors = 248874778,248874445)

Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@@ -2488,7 +2488,7 @@ public void TestInputBuilderComponentFactories()
Assert.True(success);
var inputBuilder = new InputBuilder(Env, info.InputType, catalog);

var args = new SdcaBinaryTrainer.Arguments()
var args = new SdcaBinaryTrainer.Options()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

args [](start = 16, length = 4)

rename args to options?

.Append(r => (r.label, score: ctx.Trainers.Sdca(r.label, r.features, maxIterations: 2,
onFit: p => pred = p, advancedSettings: s => s.NumThreads = 1)));
.Append(r => (r.label, score: ctx.Trainers.Sdca(r.label, r.features, null,
new SdcaRegressionTrainer.Options() { MaxIterations = 2, NumThreads = 1 },
Copy link
Member

@singlis singlis Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumThreads [](start = 73, length = 10)

@ if numthreads should be 1, would it better to default NumThreads in options to 1? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modifying default settings is out of purview of this PR.

i think we are using NumThreads = 1 only for test cases to be more stable


In reply to: 248875274 [](ancestors = 248875274)

Copy link
Member

@singlis singlis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@singlis
Copy link
Member

singlis commented Jan 17, 2019

Overall looks good - I left some minor feedback.

@abgoswam abgoswam merged commit b65d725 into dotnet:master Jan 18, 2019
@abgoswam abgoswam deleted the abgoswam/action_arguments_3 branch January 21, 2019 21:25
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants