Added Summary function for model summary #1015
I think this is too specific, not every model is a Chain and has a |
Sorry for the delayed reply.
Helper function for printing the summary of the model in a proper format.
"""
function print_summary(summ)
If we do end up with a struct for the summary, it's better to instead overload the show method here.
Although what advantage does a struct have here?
While discussing this on Slack, someone suggested that it would be better to hand the information collected by the function back to the user and let them decide what to do with it, for example analyzing the number of parameters within a particular range of layers. That was the sole reason for using a struct here.
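The two suggestions in this thread (return a struct, and overload show instead of writing a separate print helper) could be combined roughly as follows. This is only a sketch in plain Julia: `LayerInfo`, `ModelSummary`, and `total_params` are hypothetical names chosen for illustration, not names from the PR or from Flux, and the parameter counts are made-up sample values.

```julia
# Hypothetical types for illustration only; not taken from the PR or from Flux.
struct LayerInfo
  name::String
  nparams::Int
end

struct ModelSummary
  layers::Vector{LayerInfo}
end

# The total is derived on demand rather than stored as a second copy.
total_params(s::ModelSummary) = mapreduce(l -> l.nparams, +, s.layers; init = 0)

# Overloading `show` lets `print` and the REPL render the summary,
# so a separate `print_summary` helper is not needed.
function Base.show(io::IO, s::ModelSummary)
  for (i, l) in enumerate(s.layers)
    println(io, i, "  ", rpad(l.name, 12), l.nparams)
  end
  print(io, "Total: ", total_params(s))
end

# Sample data (made up) standing in for a collected summary.
s = ModelSummary([LayerInfo("Dense", 55), LayerInfo("relu", 0), LayerInfo("Dense", 12)])
print(s)

# Because the result is plain data, a caller can analyze any range of layers,
# here the parameter count of the first two layers only.
first_two = mapreduce(l -> l.nparams, +, s.layers[1:2]; init = 0)
```

The struct keeps the data available for further analysis, while the `show` method covers the display use case that `print_summary` was written for.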
# Summary of the model #
############################################################################
struct Trainable_layer
  layer_name::Any
Please follow the same style as the rest of the code
I apologize for that. This was my first PR to Flux, so I wasn't yet familiar with the code style at the time. I will keep it in mind in future.
"""
function Summary(model)
  layers_vec =[]
  layers = model.layers
Probably don't need to maintain a copy of existing information. Layers with no params can probably be ignored, although a placeholder suggesting the existence of one would be good.
It's important to note that the notion of what a model consists of is very grey. Many layers for us might be regular functions and closures. While I'd like to see all the parameters in a model, as mentioned earlier, not only might I not have a Chain, but the parameters themselves could be referencing arbitrary objects. That might be too hard to consider in this PR, though.
Yes, that's one of the problems here: how a model is built up is not established beforehand. So this method can be applied to Chain models only, since in other kinds of models the layer information might not be as easily accessible. As I have found that most of the pre-trained models in Metalhead are Chain-based, I thought this method might be useful there. Moreover, I think this could be better defined as a utility for Chain-based models. I wanted to implement something similar to the Keras summary function for Sequential models.
However, for the layers in a Chain that have no parameters, a placeholder can be added to indicate their presence, as you suggested above. I am also interested in finding out whether the scope of the function can be extended to other ways of defining models. Is there any way to extract this information from generic functions?
This is handled by |
I have implemented a Summary function that provides a brief model summary. It returns a struct containing a list of the trainable layers (with the number of parameters in each layer) and the total number of trainable parameters. This gives an overview of the number of parameters used by each layer and by the model overall. Kindly review this and suggest any further changes required.
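For readers following the discussion, here is a rough, Flux-free sketch of the kind of per-layer count described above, including a placeholder for layers without parameters. Everything in it is an assumption made for illustration: `count_params`, `layer_summary`, and `Affine` are hypothetical names, and the field-walking approach is a plain-Julia stand-in rather than the mechanism Flux itself uses. It counts every numeric array reachable through an object's fields (which also covers closures, since they are ordinary structs in Julia), does not distinguish trainable from non-trainable arrays, and has no cycle detection.

```julia
# Hypothetical helper: count elements of numeric arrays reachable from `x`.
count_params(x::AbstractArray{<:Number}) = length(x)
count_params(x::AbstractArray) = mapreduce(count_params, +, x; init = 0)
# Types and modules are skipped so the walk never descends into their internals.
count_params(::Union{Type, Module}) = 0
function count_params(x)
  n = 0
  # Closures store captured variables as fields, so they are walked like structs.
  for i in 1:nfields(x)
    isdefined(x, i) && (n += count_params(getfield(x, i)))
  end
  return n
end

# One row per layer; parameterless layers are kept with a count of 0.
layer_summary(layers) =
  [(name = string(nameof(typeof(l))), nparams = count_params(l)) for l in layers]

# Hypothetical layer type standing in for a dense layer.
struct Affine
  W::Matrix{Float64}
  b::Vector{Float64}
end
(a::Affine)(x) = a.W * x .+ a.b

# A "model" that mixes a struct, a plain function, and a closure.
scale = let s = [2.0, 3.0]
  x -> s .* x
end
layers = (Affine(zeros(5, 10), zeros(5)), tanh, scale)

rows = layer_summary(layers)
for r in rows
  note = r.nparams == 0 ? "  (no parameters)" : ""
  println(r.name, ": ", r.nparams, note)
end
println("Total: ", mapreduce(r -> r.nparams, +, rows; init = 0))
```

Here the input is any iterable of layers rather than `model.layers`, which sidesteps the Chain-only restriction raised in the review, at the cost of counting arrays that may not be trainable.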