-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SparK Implementation #1750
base: master
Are you sure you want to change the base?
SparK Implementation #1750
Conversation
Thanks a lot for the PR! The review might take a while due to the upcoming holiday season. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for the PR! This will take a while to get merged as it contains many changes. I left a lot of comments/questions regarding the various parts. Let us know if you need help, we can also support you in making the changes or take over parts of the PR if that helps.
SparseMaxPooling, | ||
SparseSyncBatchNorm2d, | ||
SparseConvNeXtBlock, | ||
SparseConvNeXtLayerNorm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SparseConvNeXtLayerNorm
is imported but not used
) | ||
|
||
|
||
class SparseConvNeXtBlock(nn.Module): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between this ConvNeXt implementation and the ones from torchvision and timm? Would it be possible to inherit from torchvisions CNBlock
instead as it is used in dense_model_to_sparse
? This way we wouldn't have to implement DropPath
ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially opting to not rely on Timm, but I'm fine with working with Timm for simplicity. Will change!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move code in this file to lightly/models/modules/spark_sparse.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move the code in this file to lighlty/models/modules/spark_sparse.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not add this emtpy file
Thanks for all the comments! I'll make some changes later today and I'll let you know how much help I'll need given the scope of this project. |
This is a WIP implementation of the SparK algorithm, as requested by #1462