
Documenting Design Patterns #891

Open · MikeInnes opened this issue Oct 11, 2019 · 17 comments

@MikeInnes (Member) commented Oct 11, 2019

A lot of what Flux can do is not explicitly written down. Regularisation is a good example: just grabbing your parameters and summing them is really simple and intuitive, but if you're used to frameworks that provide an explicit API for this, you might not think of it, or assume it's not supported at all because it isn't in the docs.
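For instance, a minimal sketch of L2-style regularisation as plain Julia code (the layer sizes and the 0.01 weight are arbitrary choices for illustration):

```julia
using Flux, LinearAlgebra

m = Chain(Dense(10, 5, relu), Dense(5, 2))

# The penalty is just an ordinary function over the collected parameters.
penalty() = sum(norm, Flux.params(m))

# Add it to whatever loss you're using.
loss(x, y) = Flux.mse(m(x), y) + 0.01 * penalty()
```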

So we need to document Flux "design patterns" that explicitly cover features from other frameworks. Some things off the top of my head:

Ideas (or requests) for other features whose usage we should document are welcome.

@DhairyaLGandhi (Member):

Writing custom adjoints and the intuition behind them. Maybe even a section on the semantics of differentiating arbitrary code (find, handling workers, etc.).
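For concreteness, a minimal sketch of a custom adjoint using Zygote's `@adjoint` macro (`mysquare` is a made-up example function):

```julia
using Zygote
using Zygote: @adjoint

mysquare(x) = x^2

# An adjoint returns the primal result plus a pullback that maps the
# output sensitivity x̄ back to input sensitivities.
@adjoint mysquare(x) = mysquare(x), x̄ -> (2x * x̄,)

gradient(mysquare, 3.0)  # (6.0,)
```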

@MikeInnes (Member, Author):

Agreed, we definitely need to replace the backprop section that used to cover Tracker. We don't want to duplicate all the Zygote docs, but some explanation in the context of Flux would be really helpful.
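As a baseline for such a section, taking gradients with respect to a model's parameters looks something like this (a sketch; the `W` field name is a Dense internal at the time of writing):

```julia
using Flux

m = Dense(3, 2)
x = rand(Float32, 3)

# Implicit-parameter style: the result is keyed by the parameter arrays.
gs = gradient(() -> sum(m(x)), Flux.params(m))
gs[m.W]  # gradient with respect to the weight matrix
```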

@jumerckx commented Oct 12, 2019

I'd love some explanation of mutation in a model using Buffer.

@scheidan (Contributor):

An example of gradient clipping would be good too.

@MikeInnes (Member, Author):

@scheidan I think we should add some clipping layers (#672), but yes, giving people the know-how to do it themselves more generally is of course also a good idea.
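In the meantime, a minimal sketch of doing it by hand (the 0.5 threshold and the optimiser are arbitrary):

```julia
using Flux

m = Dense(10, 2)
x, y = rand(Float32, 10), rand(Float32, 2)

gs = gradient(() -> Flux.mse(m(x), y), Flux.params(m))

# Clamp every gradient entry elementwise before the update.
for p in Flux.params(m)
  gs[p] .= clamp.(gs[p], -0.5f0, 0.5f0)
end

Flux.Optimise.update!(Descent(0.1), Flux.params(m), gs)
```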

@merckxiaan Is there anything Flux-specific we should say about Buffer? It's probably worth at least mentioning, along with a few other more advanced tricks that are documented by Zygote.

@jumerckx:

I haven't got anything specific in mind, but it'd be interesting to see a simple deep learning network that uses Buffers.
I'm sure Buffers are really helpful for building efficient models, but I still don't really get how anything can be achieved when mathematical operations aren't permitted on them.

@MikeInnes (Member, Author):

Buffers are really just meant as a workaround when you want to do array construction using mutation. So you might use them inside the definition of a basic array op like cat, but they'd be completely transient; you wouldn't pass Buffers through a deep learning model.
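A toy example of that pattern (`mycat` is made up for illustration; the Buffer is filled by mutation, then frozen with `copy` before it escapes):

```julia
using Zygote
using Zygote: Buffer

function mycat(xs...)
  buf = Buffer(xs[1], sum(length, xs))  # transient, write-friendly workspace
  i = 0
  for x in xs, v in x
    buf[i += 1] = v
  end
  return copy(buf)  # freeze into an ordinary, differentiable array
end

gradient(x -> sum(mycat(x, x)), [1.0, 2.0])  # ([2.0, 2.0],)
```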

@janEbert (Contributor):

> Advanced Gradients (link Zygote docs)

Adding to that, not tracking parameters should be included there (or somewhere else) as well.
You mentioned Zygote.dropgrad.
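Something like this (a sketch of `dropgrad`, assuming the current Zygote API):

```julia
using Zygote

# `y` is hidden from differentiation, so its gradient comes back as `nothing`.
f(x, y) = x * Zygote.dropgrad(y)

gradient(f, 2.0, 3.0)  # (3.0, nothing)
```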

@MikeInnes (Member, Author):

Yes, that definitely also fits under "things we should have APIs for" too; added to the list.

@appleparan commented Nov 9, 2019

How about writing a tutorial on porting code from the TensorFlow or PyTorch docs? A lot of ML code out there is written for TF or Torch, and I often found it confusing to convert. For me, the input data structure is the most unfamiliar part compared to other frameworks. Without the model zoo, it's hard to use Flux.jl from the docs alone. I think one-to-one comparisons of other frameworks' code with Flux would be helpful.

@MikeInnes (Member, Author):

Good idea. I think we could add "Flux for TF users" or "Flux for PyTorch users" guides as sections in the docs. Happy to help anyone who wants to contribute that.
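A sketch of what one such side-by-side entry might look like (layer sizes arbitrary; the Keras line is paraphrased in a comment):

```julia
using Flux

# Keras:  model = Sequential([Dense(64, activation="relu"), Dense(10)])
# Flux equivalent; note that input sizes are explicit:
model = Chain(
  Dense(784, 64, relu),
  Dense(64, 10),
)
```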

@RohitMazumder:

If no one is already working on it, I would love to contribute to "Flux for TF users"!

@MikeInnes (Member, Author):

That would be great!

@cossio (Contributor) commented Jan 13, 2020

Automatic parameter extraction for custom types, via @functor.

E.g., add this example from @dhairyagandhi96 to the docs, which shows how you can control which fields are added to the parameter tree:

```julia
julia> using Flux

julia> using Flux: @functor

julia> struct MyLayer{T,K}
         a::T
         b::K
       end

julia> @functor MyLayer (a,)  # only the `a` field becomes a trainable parameter

julia> _l = MyLayer(rand(3, 3), rand(5));

julia> size.(Flux.params(_l))  # `b`, the length-5 vector, is excluded
1-element Array{Tuple{Int64,Int64},1}:
 (3, 3)
```

@DhairyaLGandhi (Member):

Adding transfer learning examples would be good too.
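For example, a minimal sketch of the fine-tuning pattern (the backbone here is a stand-in; in practice it might come from Metalhead.jl):

```julia
using Flux

backbone = Chain(Dense(100, 50, relu), Dense(50, 25, relu))  # "pretrained" part
model = Chain(backbone, Dense(25, 2))                        # fresh task head

# Freezing is just parameter selection: train only the head's parameters.
ps = Flux.params(model[2])
loss(x, y) = Flux.mse(model(x), y)
data = [(rand(Float32, 100), rand(Float32, 2))]
Flux.train!(loss, ps, data, Descent(0.1))
```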

@DhairyaLGandhi (Member):

Data loading + visualisation, and TensorBoardLogger.jl integration.
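A minimal sketch of the logging side, following TensorBoardLogger.jl's documented pattern (the loss values are stand-ins):

```julia
using TensorBoardLogger, Logging

lg = TBLogger("runs/example")  # view with: tensorboard --logdir runs

with_logger(lg) do
  for step in 1:100
    loss = exp(-step / 30)   # stand-in for a real training loss
    @info "train" loss = loss
  end
end
```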

@ToucheSir (Member):

Thoughts on breaking this into smaller issues and creating a project board for them? I know some (e.g. RNNs, ecosystem) have been addressed already.
