
Using ScoreTensorFlow model is a bit confusing #2165

Closed
JakeRadMSFT opened this issue Jan 16, 2019 · 10 comments

Comments

@JakeRadMSFT
Contributor

System information

  • OS version/distro: Windows
  • .NET Version (e.g., dotnet --info): 0.8.0

Issue

  • What did you do?
    See code below. I'm loading a pre-trained TensorFlow model and I was working from the existing examples.

  • What happened?
    I didn't understand why the example was passing training/test data to get the prediction function (see code and comment below).

  • What did you expect?
    It seems like I should just be able to create a pipeline with pre-processing steps and ScoreTensorFlowModel and then just get the predict function. To test this theory I tried using MultiFileSource(null) and everything works fine. If it's not needed, can you recommend different code? If it is needed, it seems kind of odd.

Source code / logs

            var loader = new TextLoader(mlContext,
                new TextLoader.Arguments
                {
                    Column = new[] {
                        new TextLoader.Column("ImagePath", DataKind.Text, 0),
                    }
                });

            // Why is this needed? It works fine with the MultiFileSource being null. There shouldn't need to be training data when loading a pre-trained model.
            var data = loader.Read(new MultiFileSource(null));

            var pipeline = mlContext.Transforms.LoadImages(imageFolder: imageFolderPath, columns: ("ImagePath", "ImageReal"))
                            .Append(mlContext.Transforms.Resize("ImageReal", "ImageReal", ImagePreprocessSettings.imageHeight, ImagePreprocessSettings.imageWidth))
                            .Append(mlContext.Transforms.ExtractPixels(new[] { new ImagePixelExtractorTransform.ColumnInfo("ImageReal", TensorFlowModelSettings.InputTensorName, interleave: ImagePreprocessSettings.channelsLast, offset: ImagePreprocessSettings.mean) }))
                            .Append(mlContext.Transforms.ScoreTensorFlowModel(modelFilePath, new[] { TensorFlowModelSettings.InputTensorName }, new[] { TensorFlowModelSettings.OuputTensorName }));
            
            // What am I "fitting" and why am I passing "data"?
            var model = pipeline.Fit(data);

            var predictionFunction = model.MakePredictionFunction<TrainTestData, PredictionProbability>(mlContext);

            return predictionFunction;
@Ivanidzo4ka
Contributor

Ivanidzo4ka commented Jan 31, 2019

So we have the concepts of IEstimator and ITransformer; see https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetHighLevelConcepts.md
In the general case you create a pipeline by chaining IEstimators together, then you fit them, and that produces a TransformerChain.
Since all of your estimators are trivial (they don't require a pass through the data to create internal state), you can chain the transformers together without any fitting:

 var model = new ImageLoaderTransformer(mlContext, imageFolder: imageFolderPath, columns: ("ImagePath", "ImageReal"))
            .Append(new ImageResizerTransformer(mlContext, "ImageReal", "ImageReal", ImagePreprocessSettings.imageHeight, ImagePreprocessSettings.imageWidth))
            .Append(new ImagePixelExtractorTransformer(mlContext, new[] { new ImagePixelExtractorTransformer.ColumnInfo("ImageReal", TensorFlowModelSettings.InputTensorName, interleave: ImagePreprocessSettings.channelsLast, offset: ImagePreprocessSettings.mean) }))
            .Append(new TensorFlowTransformer(mlContext, modelFilePath, new[] { TensorFlowModelSettings.InputTensorName }, new[] { TensorFlowModelSettings.OuputTensorName }));

But you can use that trick only if all of your pipeline's estimators are independent of the data. (One of the easiest ways to tell is to check which class your estimator returns from Fit: if that class has a public constructor, it doesn't require training.)
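
The "does the Fit result have a public constructor" heuristic above can even be checked mechanically with plain reflection. A minimal sketch (the helper name is illustrative, not an ML.NET API):

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class TrainabilityHeuristic
{
    // Per the heuristic above: a transformer type that exposes a public
    // instance constructor can be built directly, which suggests its
    // corresponding estimator does not require training data.
    public static bool HasPublicCtor(Type transformerType) =>
        transformerType.GetConstructors(BindingFlags.Public | BindingFlags.Instance)
                       .Any();
}
```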

@Ivanidzo4ka
Contributor

Sorry for the delay in responding; I hope this answers your question.

@JakeRadMSFT
Contributor Author

JakeRadMSFT commented Feb 5, 2019

Yep! I couldn't get my code to format correctly to post the final solution here, but it's essentially what you have above with the input/output column positions switched. I also had to update to 0.10.0.

I also had to call model.CreatePredictionEngine<TrainTestData, PredictionProbability>(mlContext); to get the equivalent predict function.

Thanks!

@Ivanidzo4ka
Contributor

One thing worth mentioning: we are currently in the process of hiding as much as possible, and in 0.11 you won't be able to use this trick anymore, since the transformer constructors will stop being public. We are aware that it's weird to call estimators on top of an empty file, and we will address it; there is a reference to a proposal for how to do that above. We are just in a state where, if we expose too much, it can be a nightmare to support later, so we are trying to hide pretty much everything from the user.
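
(For anyone landing here later: once the constructors went internal, the pattern that replaced the empty-file trick was to Fit on an empty in-memory data view. A sketch assuming the ML.NET 1.x API surface, where LoadFromEnumerable, ResizeImages, and Model.LoadTensorFlowModel are the 1.x names; the input class, image size, and tensor names "input"/"output" below are illustrative, not from this thread.)

```csharp
using System;
using Microsoft.ML;

// Hypothetical input type; only its schema matters for the empty Fit.
public class ImageInput
{
    public string ImagePath { get; set; }
}

public static class EmptyFitSketch
{
    public static ITransformer Build(MLContext mlContext, string imageFolderPath, string modelFilePath)
    {
        // No rows are needed: trivial estimators only read the schema.
        IDataView emptyData = mlContext.Data.LoadFromEnumerable(Array.Empty<ImageInput>());

        var pipeline = mlContext.Transforms.LoadImages("ImageReal", imageFolderPath, "ImagePath")
            .Append(mlContext.Transforms.ResizeImages("ImageReal", 224, 224))
            .Append(mlContext.Transforms.ExtractPixels("input", "ImageReal"))
            .Append(mlContext.Model.LoadTensorFlowModel(modelFilePath)
                .ScoreTensorFlowModel("output", "input"));

        return pipeline.Fit(emptyData);
    }
}
```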

@JakeRadMSFT
Contributor Author

I'm not fully following. So what will it become? I find this to be fairly intuitive.

var model = new ImageLoaderTransformer(mlContext, imageFolder: imageFolderPath, columns: ("ImageReal", "ImagePath"))
            .Append(new ImageResizerTransformer(mlContext, "ImageReal", ImagePreprocessSettings.imageWidth, ImagePreprocessSettings.imageHeight, "ImageReal"))
            .Append(new ImagePixelExtractorTransformer(mlContext, new[] { new ImagePixelExtractorTransformer.ColumnInfo(TensorFlowModelSettings.InputTensorName, "ImageReal", interleave: ImagePreprocessSettings.channelsLast, offset: ImagePreprocessSettings.mean) }))
            .Append(new TensorFlowTransformer(mlContext, modelFilePath, new[] { TensorFlowModelSettings.OuputTensorName }, new[] { TensorFlowModelSettings.InputTensorName }));

return model.CreatePredictionEngine<TrainTestData, PredictionProbability>(mlContext);
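
For completeness, the engine returned here is then used per example via Predict; a sketch continuing the code above, where the ImagePath property on TrainTestData and the example file name are assumptions:

```csharp
// Hypothetical usage of the engine built above; the ImagePath property name
// on TrainTestData is an assumption.
var engine = model.CreatePredictionEngine<TrainTestData, PredictionProbability>(mlContext);
var prediction = engine.Predict(new TrainTestData { ImagePath = "images/example.jpg" });
// prediction now holds the scored TensorFlow output column(s).
```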

@JakeRadMSFT
Contributor Author

Re-opening to continue the discussion :)

@JakeRadMSFT JakeRadMSFT reopened this Feb 6, 2019
@Ivanidzo4ka
Contributor

#1798 (comment)
So we have this issue/discussion where we decided to hide all the constructors for transformers and estimators.

From what I understand (I wasn't part of the discussion), we have the mlContext object, and we want the user, instead of digging through namespaces, documentation, and search engines, to just call mlContext.Transforms and see all the available estimators.

Somehow we also decided that any other way to construct estimators should be hidden, and the same goes for trivial (not requiring training) transformers. I wasn't part of the discussion, so I don't know the reasoning behind it, but I'm sure it exists.

We have this issue: #2354, but I'm not sure it will save you from the fake Fit call. At least I don't see that in the issue.

@JakeRadMSFT
Contributor Author

Any updates on this?

@Ivanidzo4ka, have you heard/seen anything?

@nfnpmc

nfnpmc commented Aug 18, 2019

What is it supposed to be like in 1.1.0?

@harishsk
Contributor

@JakeRadMSFT and @nfnpmc I am not sure I follow your latest questions, but it appears the previous questions have been answered. I am closing the issue for now. Please reopen with details if you need more info.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022