
Added Summary function for model summary #1015

Closed
wants to merge 2 commits

Conversation

AdarshKumar712
Contributor

I have implemented a Summary function that provides a brief model summary. It returns a struct containing a list of the trainable layers (with the number of parameters in each layer) and the total number of trainable parameters. This gives an overview of how many parameters each layer, and the model as a whole, uses. Kindly review this and suggest any further changes required.
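For reference, a minimal sketch of the idea under discussion (not the PR's actual code; layer_param_counts is a hypothetical name, and it assumes the model is a Chain and counts parameters via Flux.params):

using Flux

# Hypothetical sketch: per-layer trainable parameter counts for a Chain,
# plus the overall total, in the spirit of the proposed Summary function.
function layer_param_counts(model::Chain)
    counts = [(string(layer), sum(length, Flux.params(layer); init=0)) for layer in model.layers]
    total = sum(last, counts; init=0)
    return counts, total
end

m = Chain(Dense(10, 5, relu), Dense(5, 2))
layer_param_counts(m)  # per-layer (description, count) pairs and a total of 67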

@CarloLucibello
Member

I think this is too specific; not every model is a Chain with a layers field.

@AdarshKumar712
Contributor Author

AdarshKumar712 commented Feb 10, 2020

Sorry for the delayed reply.
Yes, I agree that this method is rather specific to Chain right now, since I added it as an additional utility for Chain-based model definitions. However, I would love to extend the function to other ways of defining models (if possible).
Edit: I also think this function could be quite useful for transfer learning with Metalhead, since those models are mostly Chain-based and a layer-by-layer parameter analysis might be really helpful there.
@CarloLucibello What are your views on this?

"""
Helper function for printing the summary of the model in a proper format.
"""
function print_summary(summ)

Member

If we do end up with a struct for the summary, it's better to instead overload the show method here

Member

Although what advantage does a struct have here?

Contributor Author

While discussing this on Slack, someone suggested that it would be better to return the collected information to the user and let them decide what to do with it, for example analysing the number of parameters in a particular subset of layers. That was the sole reason for using a struct here.
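A rough sketch of the struct-plus-show combination being discussed (ModelSummary and its fields are illustrative names, not the PR's definitions):

# Illustrative only: a summary struct the user can inspect programmatically,
# with Base.show overloaded so it still prints nicely at the REPL.
struct ModelSummary
    layers::Vector{Pair{String,Int}}  # layer description => parameter count
    total::Int                        # total trainable parameters
end

function Base.show(io::IO, s::ModelSummary)
    println(io, "Model summary")
    for (name, n) in s.layers
        println(io, "  ", rpad(name, 28), n, " parameters")
    end
    print(io, "  Total trainable parameters: ", s.total)
end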

# Summary of the model #
############################################################################
struct Trainable_layer
layer_name::Any

Member

Please follow the same style as the rest of the code

Contributor Author

I apologize for that; this was my first PR to Flux, so I wasn't yet accustomed to the style at the time. I will keep it in mind in future.

"""
function Summary(model)
layers_vec =[]
layers = model.layers

Member

Probably don't need to maintain a copy of existing information. Layers with no params can probably be ignored, although a placeholder suggesting the existence of one would be good
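A sketch of that suggestion (describe_layer is a hypothetical helper, not part of the PR): layers without trainable parameters get a placeholder entry instead of being dropped or copied wholesale.

using Flux

# Hypothetical sketch: keep a placeholder for parameter-free layers so their
# presence in the Chain is still visible in the summary.
function describe_layer(layer)
    n = sum(length, Flux.params(layer); init=0)
    if n == 0
        return string(layer, "  (no trainable parameters)")
    else
        return string(layer, "  ", n, " parameters")
    end
end

describe_layer.(Chain(Dense(10, 2), softmax).layers)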

@DhairyaLGandhi
Member

It's important to note that the notion of what a model consists of is very grey. Many of our layers may be regular functions and closures, and while I'd like to see all the parameters in a model, as mentioned earlier, I might not have a Chain at all, and the parameters themselves could reference arbitrary objects. That might be too hard to address in this PR, though.

@AdarshKumar712
Contributor Author

AdarshKumar712 commented Feb 25, 2020

Yes, that's one of the problems here: how a model is built up is not well established beforehand. So this method can only be applied to Chain models, since for other ways of defining models the layer information might not be as easily accessible. As most of the pre-trained models in Metalhead are Chain-based, I thought this method might be useful there, and it could be better framed as a utility for Chain-based models. I wanted to implement something similar to Keras's summary function for Sequential models.
Also, for layers defined as regular functions and closures, I found that they often do not show up in params(model), so their parameters are not actually included in training.
For example:

julia> m = Chain(Dense(10,2),x->Dense(2,1)(x))
Chain(Dense(10, 2), #3)

julia> params(m.layers[2])
Params([])

However, for such layers defined inside a Chain, a placeholder can be added to indicate their presence, as you suggested above. Furthermore, I am interested in whether the scope of the function can be extended to other approaches as well: is there any way to extract this information for generic functions and methods?

@mcabbott
Member

This is handled by show now
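For context, recent Flux versions print per-layer and total parameter counts as part of a model's default display, roughly along these lines (output abbreviated and version-dependent):

julia> using Flux

julia> Chain(Dense(10 => 2), Dense(2 => 1))
Chain(
  Dense(10 => 2),                       # 22 parameters
  Dense(2 => 1),                        # 3 parameters
)                   # Total: 4 arrays, 25 parameters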

@mcabbott closed this Oct 15, 2022