Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creation of components through MLContext and cleanup (Onnx, Tensorflow, SelectColumn, KeytoBinVec, ValueMap) #2367

Merged
merged 14 commits into from
Feb 7, 2019

Conversation

artidoro
Copy link
Contributor

@artidoro artidoro commented Feb 1, 2019

This PR is part of the work outlined in #1798, and focuses on the GcnNorm, LpNorm, RandomFourier, CustomStopWords, VectorWhiten, PCA transformers/estimators:

  1. Internalize constructors of estimators and transformers
  2. Keep ColumnInfo and move it to to the estimators The ColumnInfo structure should live in the estimators, rather than transformers #1760
  3. Rename Arguments -> Options
  4. Internalize Options only when they are not used by public constructor. For all other cases, retain Options as public Arguments class should be made internal when possible #1758

This PR is marked as WIP because I have to figure out how to make the DNN assemblies BestFriends with the ONNX assembly. I need to figure out how to do that.

@artidoro artidoro changed the title Creation of components through MLContext and cleanup (Onnx, Tensorflow, SelectColumn, KeytoBinVec, ValueMap) WIP: Creation of components through MLContext and cleanup (Onnx, Tensorflow, SelectColumn, KeytoBinVec, ValueMap) Feb 1, 2019
@artidoro artidoro self-assigned this Feb 1, 2019
@artidoro artidoro added the API Issues pertaining the friendly API label Feb 1, 2019
@artidoro artidoro changed the title WIP: Creation of components through MLContext and cleanup (Onnx, Tensorflow, SelectColumn, KeytoBinVec, ValueMap) Creation of components through MLContext and cleanup (Onnx, Tensorflow, SelectColumn, KeytoBinVec, ValueMap) Feb 3, 2019
@artidoro artidoro force-pushed the round5 branch 2 times, most recently from b1dfe49 to 50505e3 Compare February 3, 2019 22:21
@codecov
Copy link

codecov bot commented Feb 4, 2019

Codecov Report

Merging #2367 into master will decrease coverage by 0.01%.
The diff coverage is 80%.

@@            Coverage Diff             @@
##           master    #2367      +/-   ##
==========================================
- Coverage   71.22%   71.21%   -0.02%     
==========================================
  Files         785      785              
  Lines      140977   140954      -23     
  Branches    16116    16113       -3     
==========================================
- Hits       100412   100379      -33     
- Misses      36096    36109      +13     
+ Partials     4469     4466       -3
Flag Coverage Δ
#Debug 71.21% <80%> (-0.02%) ⬇️
#production 67.57% <78.42%> (-0.02%) ⬇️
#test 85.28% <85%> (-0.01%) ⬇️

Copy link
Member

@singlis singlis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@singlis
Copy link
Member

singlis commented Feb 5, 2019

overall it looks good to me - I left comments about the CreateKeep/CreateDrop #Resolved

@artidoro
Copy link
Contributor Author

artidoro commented Feb 5, 2019

Thank you very much for reviewing @singlis! I will address your comments. #Resolved

@artidoro artidoro force-pushed the round5 branch 3 times, most recently from f202c0f to 0b1554b Compare February 6, 2019 03:19
@@ -29,38 +29,18 @@

namespace Microsoft.ML.Transforms.Conversions
{
/// <summary>
/// Converts the key types back to binary verctors.
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verctors [](start = 47, length = 8)

verctors [](start = 47, length = 8)

typeo #Closed

/// <example>
/// <format type="text/markdown">
/// <![CDATA[
/// [!code-csharp[ScoreTensorFlowModel](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/TensorFlowTransform.cs)]
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TensorFlowTransform [](start = 108, length = 19)

TensorFlowTransform [](start = 108, length = 19)

Zeeshan A is trying to rename this file, so be careful
#2429 #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please merge with master
#2429 got merge and this file got moved.


In reply to: 254137918 [](ancestors = 254137918)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just rebased again, should be good now.


In reply to: 254514215 [](ancestors = 254514215,254137918)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you push your changes? because I still see Dynamic/TensorFlowTransform.cs instead of TensorFlow/ImageClassification.cs


In reply to: 254515347 [](ancestors = 254515347,254514215,254137918)

Copy link
Contributor Author

@artidoro artidoro Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry I think I did not rebase earlier. I now rebased and pushed again and I see Zeeshan's change now. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see in line 24 link to /// [!code-csharp[ScoreTensorFlowModel](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/TensorFlowTransform.cs)]

And can you please stop using your rebase magic? Codeflow continue to stay on same iteration, and it get's hard to track which files get changed and which are not?


In reply to: 254547579 [](ancestors = 254547579)

Copy link
Contributor Author

@artidoro artidoro Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah now I understood what you meant! the links will be broken sorry!! Fixing right away #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok now I think I did what you were saying from the beginning, I didn't understand you were talking about the link. Just pushed a new iteration.

/// <param name="catalog">The transform's catalog.</param>
/// <param name="modelLocation">Location of the TensorFlow model.</param>
/// <param name="inputColumnName"> The name of the model inputs.</param>
/// <param name="outputColumnName">The name of the requested model outputs.</param>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last word in each sentence is plural. should be singular. input/output instead of inputs/outputs. #Resolved

/// Scores a dataset using a pre-traiend TensorFlow model specified via <paramref name="tensorFlowModel"/>.
/// </summary>
/// <param name="catalog">The transform's catalog.</param>
/// <param name="tensorFlowModel">The pre-trained TensorFlow model.</param>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pre-trained TensorFlow model. [](start = 42, length = 33)

The pre-trained TensorFlow model. [](start = 42, length = 33)

I would call it pre-loaded instead of pre-trained. They all pre-trained I believe, but this one you load manually in memory, and for other you just pass path to model. #Closed

@artidoro
Copy link
Contributor Author

artidoro commented Feb 6, 2019

public static class SkipFilter

Yes you are right! Thank you :)


In reply to: 461110889 [](ancestors = 461110889)


Refers to: src/Microsoft.ML.Data/Transforms/SkipTakeFilter.cs:269 in 1fd3ec2. [](commit_id = 1fd3ec2, deletion_comment = False)

@artidoro
Copy link
Contributor Author

artidoro commented Feb 7, 2019

    public static ValueMappingEstimator<TInputType, TOutputType> ValueMap<TInputType, TOutputType>(

I would open an issue about it and discuss there if you want to change it


In reply to: 461162002 [](ancestors = 461162002)


Refers to: src/Microsoft.ML.Data/Transforms/ConversionsExtensionsCatalog.cs:172 in 1fd3ec2. [](commit_id = 1fd3ec2, deletion_comment = False)

@artidoro artidoro force-pushed the round5 branch 3 times, most recently from e5ddd64 to 15c2938 Compare February 7, 2019 04:55
@@ -48,7 +55,7 @@ public sealed class ColumnSelectingEstimator : TrivialEstimator<ColumnSelectingT
/// <param name="env">Instance of the host environment.</param>
/// <param name="keepColumns">The array of column names to keep.</param>
private ColumnSelectingEstimator(IHostEnvironment env, params string[] keepColumns)
Copy link
Contributor

@zeahmed zeahmed Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private [](start = 8, length = 7)

I don't see any use of this constructor except for being used at line 90. Since you are cleaning up the code as well, I would suggest removing it. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I removed that


In reply to: 254552460 [](ancestors = 254552460)

/// <param name="tensorFlowModel">The pre-loaded TensorFlow model.</param>
/// <param name="inputColumnName"> The name of the model input.</param>
/// <param name="outputColumnName">The name of the requested model output.</param>
public static TensorFlowEstimator ScoreTensorFlowModel(this TransformsCatalog catalog,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the example that was linked below.

/// Applies a pre-trained Onnx model.
/// All column names are provided, the input data column names/types must exactly match
/// all model input names. All possible output columns are then generated, with names/types
/// specified by the model.
/// </summary>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can put this one in <remark> section #Resolved

=> new OnnxScoringEstimator(CatalogUtils.GetEnvironment(catalog), transformer);
/// <param name="modelFile">The path of the file containing the ONNX model.</param>
/// <param name="outputColumnName">The input column.</param>
/// <param name="inputColumnName">The output column resulting from the transformation.</param>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change their description? you put input column for outputColumnName and vice versa. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that was bad thank you!


In reply to: 254794567 [](ancestors = 254794567)

/// <param name="catalog">The transform's catalog.</param>
/// <param name="modelFile">The path of the file containing the ONNX model.</param>
/// <param name="outputColumnNames">The input columns.</param>
/// <param name="inputColumnNames">The output columns resulting from the transformation.</param>
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change their description? you put input columns for outputColumnNames and vice versa. #Resolved

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@artidoro artidoro merged commit 30fe08a into dotnet:master Feb 7, 2019
@artidoro artidoro deleted the round5 branch March 13, 2019 17:55
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants