-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refresh Model Zoo #9
Comments
What is the overall vision for the model zoo? Like what attributes would make a good zoo model? I'm going to throw out some things and let me know if they are what you have in mind:
I read someone somewhere else said that they thought getting a working model zoo up to best practices would set the stage for growing the ML/DL/Flux ecosystem because then it's easier to benchmark and progressively improve. |
As someone newer to the ecosystem and Julia, it would probably be good to have a list of best practices when implementing a model - a checklist with explanations. It would promote consistency for the model zoo models and be a good jumping point for new people in the community. There's a PR for something about putting things in the global scope, but someone new from python won't necessarily know not to do that for their model. I think there is also likely an opportunity to improve these pages as well, specifically the performance one: |
As for my first post above, I just found Literate.jl, which I think could be a solution to having to maintain 1 |
These are great comments. I'll try and give some insight on some of them.
|
It's all done by https://github.com/SciML/RebuildAction |
Totally agree with this. Once the Metalhead.jl PR is resolved, I can write up a section on this in the docs for Flux.jl. |
Thanks for all the responses. It sounds like there are a handful of lists that could be put together generally on the topics of performance and best practices, and I think putting them in the Flux Docs probably makes sense. @darsnack , when you said "the custom dataset interface" it sounds like you were referring to something in particular? It also sounds like I need to familiarize myself with RebuildAction, CI, and the SciML Benchmarks, likely after this semester ends. |
Feel free to ask for help, and feel free to ping me to join the next ML Fast AI coordination call. I am getting Dhariya involved as well. |
I wrote a wonderful Documenter plugin DemoCards.jl for JuliaImages's demo page. It uses Literate.jl so you can write in plain julia files. It maps the folder structure into page structure so making changes are very flexible. |
I'm willing to put some effort in updating the model zoo. Already started testing the models, I'm keeping a list of issues here. |
So we already have the scripts folder which does the conversion, which is for this exact use case, also FluxBot.jl We need only go through the models to add literature. This is literate based as well, so it should be easy to test things out. cc @SophB |
Another thing that makes the model zoo less useful is that recently many models have this argument handling as part of every model which distracts from what the scripts are meant to be there for. Removing that and using simpler constructs that point to exactly what aspect is being talked about would make a whole lot of difference |
Yes, I noticed this while testing the models, some of them have taken a bit too much of a kitchen sink approach. IMHO the model zoo should find the right balance between three things:
My guess is that highlighting just one or maybe two features or ecosystem packages per model and using each of these features or ecosystem packages in only one or two models will result in a good balance. Overdoing it will only result in distraction from the great things that can be done with Flux, harder to grasp tutorials and longer running times for the benchmarks. Some examples:
|
I think if we use @ChrisRackauckas approach with SciML, we should be able to write Literate.jl scripts for all our model zoo examples. And I agree that some examples should throw out some of the kitchen sink and try to focus on the essence of what that example is trying to teach. Right now, the problem is that all the examples exist to be consumed in script form. Instead, if they were written in Literate.jl, we could have a Publish.jl website for the entire zoo. The pages of that site being the tutorials. I think just that change of writing for a different audience makes a difference in the produced result. |
We are working on establishing a "Flux" Publish theme for all the ecosystem packages. I can put together a sample PR where I translate a couple of the zoo examples and showcase the website. |
I have been saying for a while that if we can get the literature part in the model scripts, the conversion to be pushed to the site is trivial.
A PR would be welcome, have you seen the tutorial pages on the site already? |
They already are, check the Also check the dg/zygote branch for how the zoo used to look like, without the kitchen sink approach. |
I'm thinking of something like the tutorials page on the website, but it is automatically updated by CI on every "release" of the model zoo. That way every thing in the zoo is consumed as either a tutorial on the website, a runnable script you can download, or a script that the benchmarking CI can run. In the set up I was describing, the model zoo would have its own website that hosted all these tutorials. Is there a way to tie that to the Flux website? I don't think GH actions can trigger events on other repos? |
So the idea is exactly that, and we have a working example of that tied with FluxBot.jl, plus RebuildAction would allow for benchmarking. We are already setting up a benchmarking suite for GPU performance.
We should have that happen as part of the flux website, for sure. |
Could we add a bit in the tracker to move the site off Jekyll? Maybe publish.jl or pkgpage.jl or franklin.jl? |
Yeah moving off Jekyll to a Julia-based static website generator would make this all easier for sure. |
Can we replace any occurrence of |
I don't think that's a great idea, maybe for a model which specifically intends to show the loop or something that benefits from it.
I'm not sure what you mean here? There is ongoing and active discussion and best to have it properly |
I suggest the other way around, we give a single example of
It surely good to have the discussion, I'm just saying it is not converging and it could go on for months. My suggestion is to reverse things as currently stand, and primarily point users to the pattern that is more informative and more flexible, and only in the second instance to |
Let me rephrase this. The pattern that we have established with Flux.jl is what is represented in the examples, and we shouldn't change things only to change them back if we don't have a clear answer yet which will come with the convergence. |
there is no need to change anything back once you have custom loops |
If most of the examples use The for-loop will always be a first class API, so you can't go wrong by using custom loops everywhere. |
Not really, the for loop is under the same considerations and open to the same changes that any other api flux has had or will have. I prefer the for loop to train personally, but it isn't cogent to push it when it's not If the examples are better served to not need a for loop, then we did the right thing by teaching users to look for similar constructs in other packages that provide loops for more complex cases, and also educate them about how the loops work by having examples and docs that are meant to teach the for loop api. API design shouldn't prefer one set of assumptions for every case, but offer options. I don't want everyone copying the same for loop everywhere, the average case does not need any more complication. If there are average cases not well served, improve the api to catch that case. |
what I exactly want is for people to copy everywhere the for loop, they don't need the complication of |
I updated the main post to reflect @joostveenema's progress (correct if I got wrong). I don't think we need to halt on the JuliaML PRs, since the current available solutions w.r.t. data loading work well. There is no reason to halt meaningful progress on an important model-zoo refresh for a future data loading update (I don't think any of the API changes to JuliaML will surface in the model-zoo in a meaningful way anyways). I kept the Metalhead.jl PR in there, since I do think any pre-trained models in the zoo should use the future-facing version of Metalhead.jl. I plan on effectively automating the process of training and committing the pre-trained models today. I will update the Metalhead.jl PR when I am ready. |
If everyone has to copy the same piece of code around then the abstraction is wrong and we should change it. Maybe callbacks need to support more things, or there can be a few more switches. But telling everyone to roll it out by hand is only useful to exactly the same devs building the library.
Indeed, but flexibility will always be limiting. There will always be some optimizers that require less flexibility (BFGS, KrylovTrustRegion), and so fully promoting the most open choice in a function sense is also limiting in another sense. The path forward is to try and tame what can be done and capture what users do into a simplified API to then specialize and help the code perform better, not let it run too loose. Isn't that the point of Flux in the first place since you can just define the layers by hand? |
@darsnack I did start marking issues as stale but wasn't done yet. I'll spend some more time on it this week. |
I have added and updated README of all the vision model in FluxML/model-zoo#305 which fixes #218 |
I have made all the changes in FluxML/model-zoo#305 can anyone please review. |
This issue will track long term progress for the model-zoo. Here are the steps to get there (in order):
outdims
LearnBase.jlyet to be filed MLDataPattern.jlOriginal Text
Collecting these issues and PRs here for now. Eventually we may want to split some out for triage.
HWCN
order instead ofWHCN
recommended by document model-zoo#235 incorrect dimension order for some CNNsThe text was updated successfully, but these errors were encountered: