Make TextFeaturizer's extractors etc. configurable again #838

Closed
Zruty0 opened this issue Sep 5, 2018 · 2 comments

@Zruty0
Contributor

Zruty0 commented Sep 5, 2018

#801 is making the parameters default and hardcoded for the following options of Text featurizer:

Once the individual building blocks become estimators, we should bring these parameters back (in the form of estimators for the word/char extractors, etc.).

Or maybe we shouldn't, and should instead just demonstrate how to compose your own version of the text transform from the individual building blocks?

@Zruty0 Zruty0 added the API Issues pertaining the friendly API label Sep 5, 2018
@shauheen shauheen added this to the 0918 milestone Sep 5, 2018
@justinormont
Contributor

The text transform is an incredible time saver and lowers the barrier to entry for making a good NLP model.

Building from the individual blocks is a rough road to travel, and doesn't add much extra power. The only case I recall needing to use the individual blocks was when using the lemmatizer, which I don't think is available in ML.NET.

Also, I'd recommend Bigrams+Trichar as the defaults, which matches our default text recipe.
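To make the "Bigrams+Trichar" suggestion concrete, here is a minimal, library-free Python sketch of what such a featurizer computes: counts of word n-grams up to length 2 plus character 3-grams. This is not ML.NET code. The function names, the `w:`/`c:` prefixes, the lowercasing step, and the choice to include unigrams alongside bigrams are all assumptions made for illustration; `word_n` and `char_n` stand in for the kind of extractor parameters this issue asks to expose.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined with '|'."""
    return ["|".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return every contiguous run of n characters (spaces included)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def featurize(text, word_n=2, char_n=3):
    """Count word n-grams of length 1..word_n and char n-grams of length char_n.

    Hypothetical defaults: word_n=2 ("bigrams") and char_n=3 ("trichar").
    Including the shorter word n-grams is an assumption of this sketch.
    """
    normalized = text.lower()
    tokens = normalized.split()
    counts = Counter()
    for n in range(1, word_n + 1):
        counts.update("w:" + gram for gram in word_ngrams(tokens, n))
    counts.update("c:" + gram for gram in char_ngrams(normalized, char_n))
    return counts


features = featurize("The cat sat")
```

Changing `word_n` or `char_n` changes which features are produced, which is the kind of configurability under discussion here.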

@eerhardt
Member

eerhardt commented Mar 2, 2019

Is this strictly adding new API? Can this be done without a public API breaking change? If so, I think we can remove it from Project 13, and it can be added after v1.0.

But if this requires a public API breaking change, then it can be left in Project 13.

@zeahmed zeahmed self-assigned this Mar 4, 2019
@shauheen shauheen added this to the 0319 milestone Mar 5, 2019
@zeahmed zeahmed closed this as completed Mar 21, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022