Mass rename of transformers and MLContext extensions for them #1318
Comments
What types of sub-namespaces are we looking for? For example, would we want categories like {
Have we publicly defined the term "trainable"? Tersely: trainable components take a pass over the data and learn from it; once trained, they can be used. Examples include text featurization (a pass to index the words), label encoding (a pass to index the labels), normalizers (a pass to learn the range of the numbers), and feature selection (a pass to count/learn which features to keep).
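To make the "trainable" idea above concrete, here is a minimal sketch of a normalizer that takes one pass over the data to learn its range and only then can be applied. All type names here (`ITransformerSketch`, `MinMaxNormalizingEstimator`, `MinMaxNormalizingTransformer`, `TrainableDemo`) are hypothetical stand-ins written for this illustration; they are not the ML.NET classes and do not use the ML.NET data types.

```csharp
using System;
using System.Linq;

// Hypothetical, simplified stand-ins for illustration; NOT the ML.NET types.
public interface ITransformerSketch
{
    double[] Transform(double[] values);
}

// The trained result: holds the learned range and can be applied to data.
public sealed class MinMaxNormalizingTransformer : ITransformerSketch
{
    private readonly double _min;
    private readonly double _max;

    // Non-public constructor (a choice made for this sketch):
    // instances are obtained only by calling Fit() on the estimator.
    internal MinMaxNormalizingTransformer(double min, double max)
    {
        _min = min;
        _max = max;
    }

    public double[] Transform(double[] values)
    {
        double range = _max - _min;
        // Avoid dividing by zero when all training values were identical.
        return values.Select(v => range == 0 ? 0.0 : (v - _min) / range).ToArray();
    }
}

// The trainable component: takes one pass over the data to learn the range.
public sealed class MinMaxNormalizingEstimator
{
    public MinMaxNormalizingTransformer Fit(double[] trainingValues)
    {
        if (trainingValues == null || trainingValues.Length == 0)
            throw new ArgumentException("Training data must not be empty.", nameof(trainingValues));

        return new MinMaxNormalizingTransformer(trainingValues.Min(), trainingValues.Max());
    }
}

public static class TrainableDemo
{
    // Learns the range [10, 30], then rescales new values into [0, 1].
    public static double[] Run()
    {
        var transformer = new MinMaxNormalizingEstimator().Fit(new[] { 10.0, 20.0, 30.0 });
        return transformer.Transform(new[] { 10.0, 25.0, 30.0 }); // 0, 0.75, 1
    }
}
```

The split into an estimator (learns) and a transformer (applies) is the pattern the thread is naming; the min/max logic is just the smallest example that needs a training pass.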
If they are trainable, they should NOT have a public constructor. ¯_(ツ)_/¯
I updated the comment. @justinormont, yes on the namespaces. And the explanation of trainable is also accurate, thanks.
@Zruty0 one more question: should the transforms currently living in Microsoft.ML.Runtime.Data move to Microsoft.ML.Transforms?
They should move out of 'Runtime'. If there is some form of inherent grouping, it may or may not be reflected in sub-namespace, but the root should be
One thing I wonder if we should make explicit, is if a
That was the intent, yes.
Do we have a candidate list of the new names?
Is there a public place where this list can be grown? A wiki-like interface could be suitable.
@justinormont, I don't think we need to produce a separate list. The old names are sort of incidental, so there is no value in preserving them for posterity. The new names will be reflected in the documentation. The renaming itself is abundantly visible on the pull request.
@eerhardt suggested that we also put the transformers and estimators in subfolders based on sub-categories.
We should not forget to rename the transformers. For example, in v0.7, ValueToKeyMappingEstimator returns a TermTransform on .Fit(). And when you need to provide arguments as ColumnInfo[], it looks like: machinelearning/test/Microsoft.ML.Tests/Transformers/KeyToBinaryVectorEstimatorTest.cs Lines 50 to 54 in 9d33efe
Which is not ideal, since the estimator name does not match the transformer name.
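As a sketch of what "estimator name matches transformer name" could look like, the pair below shares the `ValueToKeyMapping` stem, so the type returned by `Fit` is predictable from the estimator's name. These are simplified, hypothetical classes written for this illustration, not the ML.NET implementations; the 1-based keys and the use of 0 for unseen values are assumptions of the sketch.

```csharp
using System.Collections.Generic;

// Simplified stand-in, NOT the ML.NET class: maps each distinct value to a key.
public sealed class ValueToKeyMappingTransformer
{
    private readonly Dictionary<string, uint> _keys;

    internal ValueToKeyMappingTransformer(Dictionary<string, uint> keys)
    {
        _keys = keys;
    }

    // Assumption of this sketch: values not seen during Fit map to 0.
    public uint Transform(string value)
    {
        return _keys.TryGetValue(value, out uint key) ? key : 0u;
    }
}

// Same stem as the transformer it produces, so Fit() holds no naming surprise.
public sealed class ValueToKeyMappingEstimator
{
    public ValueToKeyMappingTransformer Fit(IEnumerable<string> values)
    {
        var keys = new Dictionary<string, uint>();
        foreach (string value in values)
        {
            // Assign keys 1, 2, 3, ... in order of first appearance.
            if (!keys.ContainsKey(value))
                keys[value] = (uint)(keys.Count + 1);
        }
        return new ValueToKeyMappingTransformer(keys);
    }
}
```

With this pairing, a reader who sees `new ValueToKeyMappingEstimator().Fit(data)` can guess the result type without looking it up.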
In addition to the internal estimator classes, I'd like to highlight that many of the methods that create estimators in the new MLContext catalog are named with a noun rather than a verb, so they look like properties even though they are methods. According to C# conventions (and those of most languages), a method's name should contain a verb describing the action the method performs. For example, the following are current methods creating objects:
(In particular, since the "TextReader" method creates a TextLoader, it should also be renamed to use TextLoader in its name, but that is a separate issue.) The same applies to all the methods for creating trainers from the MLContext catalog, such as:
When you see that code for the first time, the noun-based names make the calls feel like properties, but they are methods. I think those methods should be named something like:
So they feel like methods, not properties.
Those methods are not normalizing or concatenating anything at that moment within the object that owns them. In reality, they are creating an object, so they should probably be named with a verb that conveys the creation of a specific type. Something like:
What I'm proposing is aligned with standard C# naming conventions and with what C# developers are used to. In particular, a method named with just a noun feels like a property instead of a method. Thoughts?
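To illustrate the noun-versus-verb point, here is a small sketch of a catalog class exposing the same factory under both naming styles. The names `TransformsCatalogSketch`, `NormalizingEstimatorSketch`, `Normalizer`, and `CreateNormalizer` are invented for this example; they are not the actual MLContext methods, nor the specific names proposed in the comment above.

```csharp
// Hypothetical estimator type used only by this illustration.
public sealed class NormalizingEstimatorSketch
{
    public string ColumnName { get; }

    public NormalizingEstimatorSketch(string columnName)
    {
        ColumnName = columnName;
    }
}

// Hypothetical catalog; NOT the MLContext API.
public sealed class TransformsCatalogSketch
{
    // Noun-style name: without the parentheses it would read like a property.
    public NormalizingEstimatorSketch Normalizer(string columnName)
    {
        return new NormalizingEstimatorSketch(columnName);
    }

    // Verb-style name: states that a new object is being created,
    // and that no normalization happens at call time.
    public NormalizingEstimatorSketch CreateNormalizer(string columnName)
    {
        return new NormalizingEstimatorSketch(columnName);
    }
}
```

Both methods do the same work; only the call site differs: `catalog.Normalizer("Age")` versus `catalog.CreateNormalizer("Age")`.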
Let's go over all the existing transforms and make sure that they share the same naming conventions:
Estimators
  Name: ActionPerformingEstimator, or AlgorithmNameTrainer
  Namespace: Microsoft.ML.Trainers, Transforms, or sub-namespaces if applicable
Trainers
Transformers
  Name: ActionPerformingTransformer
  Namespace: Microsoft.ML.Transforms or sub-namespaces