Add LDA example to Microsoft.ML.Samples #1782

Merged
2 commits merged on Nov 29, 2018

abgoswam
Member

@sfilipi added the "documentation" label (Related to documentation of ML.NET) on Nov 29, 2018
var transformed_data = transformer.Transform(trainData);

// Small helper to print the text inside the columns, in the console.
Action<string, IEnumerable<VBuffer<float>>> printHelper = (columnName, column) =>
Member

@sfilipi sfilipi Nov 29, 2018

printHelper

You don't need a separate helper if you only use it once. #Resolved

// A pipeline for featurizing the "Review" column
string ldaFeatures = "LdaFeatures";
var pipeline = ml.Transforms.Text.ProduceWordBags("Review").
Append(ml.Transforms.Text.LatentDirichletAllocation("Review", ldaFeatures, numTopic:3));
Member

@sfilipi sfilipi Nov 29, 2018

LatentDirichletAllocation

Just asking: besides numTopic, are there other parameters that users might tune often? #Resolved

Member Author

I doubt users tune any parameters other than numTopic.


In reply to: 237660759

Member

sfilipi commented Nov 29, 2018

    public static LatentDirichletAllocationEstimator LatentDirichletAllocation(this TransformsCatalog.TextTransforms catalog, params LatentDirichletAllocationTransformer.ColumnInfo[] columns)

Maybe add the same sample a second time, adapted to use this overload, so that it gets an example as well? #Pending


Refers to: src/Microsoft.ML.Transforms/Text/TextCatalog.cs:540 in cbd8ad3.

Member

@sfilipi sfilipi left a comment

:shipit:


// A pipeline for featurizing the "Review" column
string ldaFeatures = "LdaFeatures";
var pipeline = ml.Transforms.Text.ProduceWordBags("Review").
Contributor

@Ivanidzo4ka Ivanidzo4ka Nov 29, 2018

"Review"

Use nameof(SamplesUtils.DatasetUtils.SampleTopicsData.Review) instead? #Resolved
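
For readers unfamiliar with the suggestion, here is a minimal sketch of what `nameof` buys over a string literal. The `SampleTopicsData` class below is a hypothetical stand-in declared locally for illustration; it is not the actual class from SamplesUtils, and the sketch does not touch any ML.NET API.

```csharp
using System;

// Hypothetical stand-in for the sample data class (assumption, not the real type).
public class SampleTopicsData
{
    public string Review { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // nameof is resolved at compile time and yields the identifier as a string,
        // so renaming the property updates the column name or fails the build.
        string columnName = nameof(SampleTopicsData.Review);
        Console.WriteLine(columnName); // prints "Review"
    }
}
```

The value is identical to the literal "Review"; the difference is that a typo or a later rename is caught by the compiler instead of surfacing as a missing-column error at run time.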

Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

:shipit:

@abgoswam
Member Author

    public static LatentDirichletAllocationEstimator LatentDirichletAllocation(this TransformsCatalog.TextTransforms catalog, params LatentDirichletAllocationTransformer.ColumnInfo[] columns)

None of the examples in Microsoft.ML.Samples showcase this paradigm. Is that to keep things simple?

I am skipping this for now.


In reply to: 442996256


Refers to: src/Microsoft.ML.Transforms/Text/TextCatalog.cs:540 in cbd8ad3.

@abgoswam abgoswam merged commit 8022c4f into dotnet:master Nov 29, 2018
@abgoswam abgoswam deleted the abgoswam/LDA_Sample branch January 13, 2019 18:12
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022