Model resources contribution #20055

stevhliu · 2022-11-03T23:45:30Z

Hi friends! 👋

There are a lot of cool existing resources for how to do x with x model, and we’d like to showcase and aggregate these resources on a model’s documentation. This’ll help users see how they can get started with a model for their own tasks since we know a lot of users check out the model documentation first. Take a look at a completed resource section for DistilBERT as an example.

I’ve identified the top 20 models by pageviews, and now I’d like to open it up to the community if anyone is interested in helping!

Anyone can contribute; you just need to comment and claim one of the models on this list. Contributing is super easy:

1. Once you've claimed a model from the list, collect the existing resources from:

   - the Hugging Face blog
   - relevant materials from the 🤗 Hugging Face Course
   - the Hugging Face example scripts and notebooks
   - @NielsRogge's Transformers Tutorials repository
   - @philschmid's blog
   - notebooks from the community ❤️

2. Organize the resources by model tasks or applications (like inference or deployment):

   - Use the corresponding icons for each task (you can find the names for each icon here):
     ```
     <PipelineTag pipeline="name-of-task"/>
     ```
   - For certain categories, you can just do: 🚀 Deploy, ⚡️ Inference, or ⚗️ Optimization, etc.
   - For community resources, add the 🌎 emoji at the end to indicate it’s not an official Hugging Face resource.
   - Use this DistilBERT file as a template. You can copy and paste the intro text and just replace DistilBERT with the name of the model you're working on.

3. Open a Pull Request with the new resources for your chosen model and ping me for a review (if you’re just getting started with contributing to an open-source project, check out @merveenoyan's awesome GitHub Contribution Guide).
4. Congratulations, you just merged a PR into 🤗 Transformers, and your contribution will now help anyone who is looking at the model docs! 🎉
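
To make the organizing step more concrete, here is a minimal sketch of how collected resources could be grouped by task and rendered as a Resources section. It is only an illustration, not part of the Transformers tooling: the `render_resources` helper, the record layout, and the `example.com` URLs are all hypothetical, and the exact spacing around the `<PipelineTag>` line is an assumption, so check the DistilBERT template before copying the output.

```python
from collections import defaultdict

# Hypothetical resource records: (task, title, url, is_community).
# The URLs are placeholders, not real links.
RESOURCES = [
    ("text-classification", "A blog post on fine-tuning", "https://example.com/blog", False),
    ("text-classification", "A community notebook", "https://example.com/notebook", True),
    ("fill-mask", "An example script", "https://example.com/script", False),
]


def render_resources(resources):
    """Group resources by task and render them as Markdown text."""
    by_task = defaultdict(list)
    for task, title, url, is_community in resources:
        by_task[task].append((title, url, is_community))

    lines = ["## Resources", ""]
    for task, items in by_task.items():
        # One icon tag per task, followed by that task's resources.
        lines.append(f'<PipelineTag pipeline="{task}"/>')
        lines.append("")
        for title, url, is_community in items:
            # Community resources get the globe emoji at the end.
            suffix = " 🌎" if is_community else ""
            lines.append(f"- [{title}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines)


print(render_resources(RESOURCES))
```

Keeping the resources as plain records first makes it easy to re-sort or regroup them before pasting the result into the model's doc file.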

If you have any questions or need any help, don’t hesitate to ping me! 🤗❤️


shogohida · 2022-11-04T17:23:30Z

Hi @stevhliu, I want to work on OpenAI GPT!

stevhliu · 2022-11-04T18:15:07Z

Awesome! I'm looking forward to your contribution, and feel free to ping me if you have any questions! 🤗

shogohida · 2022-11-06T09:02:50Z

@stevhliu
I have a question. Is there a good way to search GitHub and the blog posts? I tried to find related repos and blog posts with the words "OpenAI GPT", but I couldn't find them because the search function doesn't seem to work well... Should I go through each repo or post one by one?

I made a draft pull request, although it doesn't have links to GitHub or the blog yet. You can check it to see if my research has been good or not
#20084

stevhliu · 2022-11-07T19:13:53Z

Hey @shogohida, thanks for starting on this!

The easiest way I've found for searching the blog posts is to go to the blog repo and search for mentions of GPT inside the repo. Then you can take a look at the results and see what's relevant!

For GitHub materials, you only have to look at the example scripts and notebooks, and see what task your model can be applied to. For example, OpenAI GPT is a causal language model, so you can link to example scripts for causal language modeling and also text generation. You can link the equivalent scripts in TensorFlow and Flax if they're available.

After the scripts, you can hop over to the notebooks and see what task your model can be applied to (language modeling, generate text) and do the same thing for the community notebooks!
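
If the web search is unreliable, the same idea works on a local clone. Below is a rough sketch, assuming the blog repo has already been cloned and that posts are Markdown files; the `find_mentions` helper and the `blog` directory name are hypothetical. It simply lists files whose text mentions a keyword, so you still have to read the hits and decide what is relevant.

```python
from pathlib import Path


def find_mentions(repo_dir, keyword, suffixes=(".md",)):
    """Return sorted paths of files under repo_dir that mention keyword (case-insensitive)."""
    needle = keyword.lower()
    hits = []
    for path in Path(repo_dir).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.lower():
                hits.append(path)
    return sorted(hits)


# Example usage, assuming the blog repo was cloned to ./blog (hypothetical path):
# for path in find_mentions("blog", "GPT"):
#     print(path)
```

A short keyword like "GPT" will also match GPT-2, GPT Neo, and so on, so expect some false positives for OpenAI GPT.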

shogohida · 2022-11-08T15:13:20Z

@stevhliu
Thanks for your comment! It will take a lot of time to collect resources from scripts and notebooks because I'm not very familiar with OpenAI GPT but I'll do my best. I'll let you know if I have another question

ambujpawar · 2022-11-09T15:43:00Z

Hi, I would like to take CLIP from the list you have mentioned. :)

stevhliu · 2022-11-09T16:05:49Z

That's great @ambujpawar! I'm looking forward to your contribution, and feel free to ping me if you have any questions! 🤗

Saad135 · 2022-11-10T02:28:46Z

@stevhliu I would like to work on DeBERTa

stevhliu · 2022-11-10T16:09:32Z

Great, thanks for taking on DeBERTa @Saad135! 🤗

JuheonChu · 2022-11-28T05:26:50Z

Hello, do you mind if I tackle the ALBERT model? @stevhliu

stevhliu · 2022-11-28T17:20:50Z

For sure, looking forward to your contribution @JuheonChu! 🤗

stanleycai95 · 2022-12-03T15:33:11Z

Hi! Could I try ViT? It might take me some time though, as I have some work projects to complete too.

hazrulakmal · 2022-12-03T15:51:24Z

Hi, I would like to work on XLM-RoBERTa! @stevhliu

stevhliu · 2022-12-05T18:22:14Z

Hey @stanleycai95, that would be great! Feel free to work on it when you have the time :)

Awesome, XLM-RoBERTa is all yours @hazrulakmal!

elabongaatuo · 2023-06-22T15:46:50Z

@stevhliu hello, @ENate can take it up. 😊

ENate · 2023-06-24T14:49:29Z

Okay then. Will proceed using the guidelines provided by @stevhliu and the example for DistilBERT.

ENate · 2023-06-29T18:02:12Z

@stevhliu - I saw that there is a resource for ALBERT at:

https://huggingface.co/docs/transformers/main/en/model_doc/albert

which is similar to the resources for DistilBERT you mentioned in the guidelines above at:

https://huggingface.co/docs/transformers/main/en/model_doc/distilbert#resources

stevhliu · 2023-06-30T00:32:57Z

Yeah ALBERT only has the task guides, and it doesn't go quite as in-depth as DistilBERT. For example, DistilBERT includes links to the course, notebooks, and scripts. You can probably just copy over most of the content from DistilBERT that is relevant to ALBERT (in other words, replace DistilBERTForX with ALBERTForX)!
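
As a rough illustration of the copy-and-replace idea, the sketch below rewrites a template line with plain string replacement. The `adapt_template` helper and the sample line are hypothetical. It assumes the display name and the class-name prefix differ in casing (for example `DistilBertForSequenceClassification` versus `AlbertForSequenceClassification`), so both spellings are replaced; each adapted entry still needs a manual check that the resource actually applies to ALBERT.

```python
def adapt_template(text, replacements):
    """Apply each (old, new) replacement to the template text, in order."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


# Display name and class-name prefix use different casing, so replace both.
DISTILBERT_TO_ALBERT = [("DistilBERT", "ALBERT"), ("DistilBert", "Albert")]

# Hypothetical template line, for illustration only.
line = "[`DistilBertForSequenceClassification`] is one of the DistilBERT task classes."
print(adapt_template(line, DISTILBERT_TO_ALBERT))
```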

ENate · 2023-06-30T07:02:04Z

Thanks :) @stevhliu

daniela-basurto · 2023-07-10T17:42:15Z

Hello @stevhliu is Jukebox still available?

stevhliu · 2023-07-10T18:15:31Z

Feel free to open a PR for Jukebox @daniela-basurto! 🤗

wonhyeongseo · 2023-07-11T06:30:24Z

Hello @stevhliu, may I please take up Whisper with a few of the OSSCA mentees?

Cc: tysm @ArthurZucker for the pointer! We'll start compiling models with incomplete resource tabs so our mentees can work on them.

stevhliu · 2023-07-11T21:41:53Z

Yes absolutely, thanks for your interest @wonhyeongseo!

wonhyeongseo · 2023-07-21T08:12:21Z

ahtashamilyas · 2023-07-21T08:19:17Z

Ok.

stevhliu · 2023-07-21T15:25:44Z

> I'm not sure if all of these are open for contributions though.

Thanks for checking @wonhyeongseo! I think it would be nice to eventually have Resources for all the models, so if you see other ones you're interested in contributing to, feel free to open a PR! I would focus on the more high-impact models first (like LLaMA) that get more pageviews/usage. For certain models (like BORT) that are in maintenance mode, we can skip those entirely.

wonhyeongseo · 2023-07-21T23:43:13Z

Awesome @stevhliu , thank you so much for your warm reception.

May we please reserve LLaMA as well for the OSSCA team?
In your opinion, when is the ideal time to start gathering resources after a model's release?
For LLaMA2, since it's relatively new, there might not be many official resources yet. It will depend on a model's impact as you described, but a rule of thumb would be useful.
Although I think this is already the case, would it be possible for you to sort these incomplete models and provide the top 20 sorted by impact or page views as of recent advances?
I'm sure some will pique the interest of my team and our mentees.

Thank you for the heads-up about files under maintenance! I've deleted 8 of those from the list in the above #20055 (comment), found with `grep -LZ "## Resources" * | xargs -0 grep -l "<Tip warning={true}>"`:

auto.md
bort.md
mctct.md
open-llama.md
retribert.md
tapex.md
trajectory_transformer.md
transfo-xl.md
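
For anyone who prefers not to chain `grep` calls, here is a rough Python equivalent of the command above, offered as a sketch rather than a drop-in replacement. The `missing_resources_with_warning` helper is hypothetical; it reports files in one directory that lack a `## Resources` heading but do contain `<Tip warning={true}>`, which is the same pair of conditions the shell pipeline checks.

```python
from pathlib import Path


def missing_resources_with_warning(doc_dir):
    """Return names of files that lack '## Resources' but contain '<Tip warning={true}>'."""
    names = []
    for path in sorted(Path(doc_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories; the search is not recursive
        text = path.read_text(encoding="utf-8", errors="ignore")
        if "## Resources" not in text and "<Tip warning={true}>" in text:
            names.append(path.name)
    return names


# Example usage, run from the directory holding the model doc files:
# print(missing_resources_with_warning("."))
```

Unlike the shell glob `*`, this version also looks at hidden files, so the two can differ slightly on directories that contain dotfiles.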

Thank you so much for your support @stevhliu .
Hope you have a wonderful weekend!

Best regards,
Won Seo

stevhliu · 2023-07-24T20:59:31Z

> May we please reserve LLaMA as well for the OSSCA team?

For sure! 👍

> In your opinion, when is the ideal time to start gathering resources after a model's release?

I think maybe whenever you see some content, you can open a PR to add it to the model page. It's ok if it's just one guide/tutorial/blog post; we can gradually add to it as more content and resources get created. For example, Philipp has a blog post about fine-tuning LLaMA 2 on SageMaker here that can be added :)

> Although I think this is already the case, would it be possible for you to sort these incomplete models and provide the top 20 sorted by impact or page views as of recent advances?

By downloads, here are the next top 20 models (it's okay to skip some of the models if there aren't any available resources for them):

BART
CLIPSeg
Marian
MPNet
ELECTRA
ResNet
CamemBERT
HuBERT
LLaMA
Longformer
VisionEncoderDecoder
GPT NeoX
EnCodec
ConvBERT
mBART
GPT Neo
FNet
YOLOS
BLIP
BEiT

ahtashamilyas · 2023-07-25T09:21:34Z

ok

ajaitly11 · 2023-08-03T19:19:03Z

@stevhliu
Hello, I would like to put together some resources for Longformer. I'm willing to look into CamemBERT as well if Longformer has already been taken.

debrupf2946 · 2023-10-01T04:40:15Z

Hi @stevhliu can I work on LLaMA, can you please assign me?

stevhliu · 2023-10-02T16:31:52Z

Hi @debrupf2946, we already have a resource section for Llama. Feel free to work on another model if you're interested!

siddharth1012 · 2023-10-23T05:28:09Z

Hi @stevhliu, can I work on ALBERT, which is the only model left on the list?

ENate · 2023-10-23T07:40:58Z

Hi @siddharth1012, I am almost done with ALBERT and about to open a PR. The delay is due to a doc-builder issue I encountered, as well as a problem with the Cython version.

ENate · 2023-10-23T07:41:25Z

So I am working on ALBERT thanks

jaykhatri0875 · 2023-11-13T15:48:06Z

Hi, is there anything that I can help with? I can see a lot of different models being discussed above. Where can I find the open list?
Thanks,

stevhliu · 2023-11-13T19:24:27Z

Hi, thanks for your interest @jaykhatri0875! The open model list is here but I think we're close to completing it. If you're interested in making other contributions, feel free to check out the Good First Issues 🤗

stevhliu · 2023-11-16T19:45:48Z

All finished now! 🥳

stevhliu added the Help wanted and Good First Documentation Issue labels Nov 3, 2022

shogohida mentioned this issue Nov 6, 2022

[Docs] Add resources of OpenAI GPT #20084

Merged

5 tasks

NielsRogge added the Good First Issue label Nov 7, 2022

Saad135 mentioned this issue Nov 10, 2022

Add to DeBERTa resources #20155

Merged

5 tasks

ambujpawar mentioned this issue Nov 13, 2022

Add clip resources to the transformers documentation #20190

Merged

4 tasks

sgugger closed this as completed in #20190 Nov 15, 2022

NielsRogge reopened this Nov 15, 2022

sgugger closed this as completed in #20084 Nov 16, 2022

stevhliu reopened this Nov 16, 2022

hazrulakmal mentioned this issue Dec 7, 2022

added model resources for xlm-roberta #20637

Closed

3 tasks

This was referenced Dec 8, 2022

Albert resource #20667

Closed

Added resources on albert model #20697

Closed

Added resources for albert architecture #20717

Closed

stanleycai95 mentioned this issue Dec 10, 2022

Add model resources for ViT #20723

Merged

5 tasks

hazrulakmal mentioned this issue Dec 12, 2022

Add docs xlm roberta #20742

Merged

wonhyeongseo mentioned this issue Aug 16, 2023

Add Llama2 resources #25531

Merged

5 tasks

eenzeenee mentioned this issue Aug 30, 2023

Add LLaMA resources #25859

Merged

5 tasks

junejae mentioned this issue Sep 10, 2023

docs: feat: add llama2 notebook resources from OSSCA community #26076

Merged

5 tasks

junejae mentioned this issue Sep 30, 2023

docs: feat: add clip notebook resources from OSSCA community #26505

Merged

5 tasks

eenzeenee mentioned this issue Oct 2, 2023

Add CLIP resources #26534

Merged

4 tasks

stevhliu closed this as completed Nov 16, 2023
