Adding resource section to GPT-J docs #21270

Merged
sgugger merged 7 commits into huggingface:main on Jan 30, 2023

Conversation

adit299
Contributor

@adit299 adit299 commented Jan 23, 2023

What does this PR do?

Adds a resources section to the GPT-J documentation.

Fixes #20055 (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sgugger @stevhliu @MKhalusova

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jan 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@adit299
Contributor Author

adit299 commented Jan 26, 2023

Hello,

I am currently working on finding resources for GPT-J, mainly by going through the links mentioned in #20055 and searching for GPT-J in each of them. I found a few links, but I feel this is not the best way to find resources. Can you share some tips on how you were able to find more? @stevhliu

What I have so far:

GPT-J Description:

Blog Posts:

NielsRogge's Transformers Tutorials:

@stevhliu
Member

Thanks for your work, that's a great start and I think you have most of them! You can also add:

  • This GPT-J notebook from Niels' Transformers Tutorials for inference.
  • A chapter in the Hugging Face Course for causal language modeling.
  • The example scripts and notebooks for causal language modeling and text generation (see the last three bullet points under the Resource section here for GPT-2).

Member

@stevhliu stevhliu left a comment

Nice, thanks for adding these! Left a few comments on formatting and using the right GPT model, after which we should be good :)

docs/source/en/model_doc/gptj.mdx (resolved)
docs/source/en/model_doc/gptj.mdx (outdated, resolved)
- A blog post introducing GPT-J [GPT-J-6B: 6B JAX-Based Transformer](https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/) 🌎
- A notebook for [GPT-J-6B Inference Demo](https://colab.research.google.com/github/kingoflolz/mesh-transformer-jax/blob/master/colab_demo.ipynb)
- [Causal language modeling](https://huggingface.co/course/en/chapter7/6?fw=pt#training-a-causal-language-model-from-scratch) chapter of the 🤗 Hugging Face Course.
- [`GPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
Member

Replace GPT2 with GPTJ, same for the TF and Flax implementations

Suggested change
- - [`GPT2LMHeadModel`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
+ - [`GPTJForCausalLM`] is supported by this [causal language modeling example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling), [text generation example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-generation), and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
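
For anyone applying the same relabeling to the TF and Flax entries, here is a minimal sketch of the substitution. It assumes the GPT-J counterparts of the GPT-2 language modeling heads are `GPTJForCausalLM`, `TFGPTJForCausalLM`, and `FlaxGPTJForCausalLM`; the helper `relabel_resource_line` is hypothetical and not part of this PR or of any repository tooling.

```python
import re

# Assumed mapping from GPT-2 LM head classes to their GPT-J counterparts.
GPT2_TO_GPTJ = {
    "GPT2LMHeadModel": "GPTJForCausalLM",
    "TFGPT2LMHeadModel": "TFGPTJForCausalLM",
    "FlaxGPT2LMHeadModel": "FlaxGPTJForCausalLM",
}

# Only match class names written in the docs' [`ClassName`] link syntax,
# so URLs and anchors that mention GPT-2 are left untouched.
_LABEL = re.compile(r"\[`(" + "|".join(map(re.escape, GPT2_TO_GPTJ)) + r")`\]")


def relabel_resource_line(line: str) -> str:
    """Hypothetical helper: swap GPT-2 class labels for GPT-J ones."""
    return _LABEL.sub(lambda m: "[`" + GPT2_TO_GPTJ[m.group(1)] + "`]", line)


if __name__ == "__main__":
    print(relabel_resource_line("- [`TFGPT2LMHeadModel`] is supported by this example script."))
```

Only the bracketed label changes; the linked script anchor (`#gpt-2gpt-and-causal-language-modeling`) stays as it is, which matches the suggestion above.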

Contributor Author

I have updated the labels. However, the linked pages mention GPT-2 and do not list GPT-J as supported. For example, see the script mentioned here: https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#gpt-2gpt-and-causal-language-modeling.

My questions are:

(1) I am a bit confused about what these links are. Are these just scripts that make the modeling process easier? Clarification on what these are would be appreciated.

(2) Are there any scripts present which support GPT-J? I had a look but couldn't find anything.

Member

  1. These are example scripts that show how to fine-tune a model for a certain task, for when you prefer to run a script instead of a notebook.
  2. The causal language modeling (CLM) script above should support GPT-J and, in general, any model with a CLM pretraining objective (so anything in the GPT family).
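
To make point 2 concrete, here is a rough sketch of how the CLM example script could be pointed at GPT-J instead of GPT-2. The flags mirror the ones shown in the language-modeling README for `run_clm.py`; the checkpoint name `EleutherAI/gpt-j-6B`, the output directory, and the batch size of 1 are assumptions for illustration, and the helper `build_run_clm_command` is hypothetical. This has not been verified on real hardware, and a 6B-parameter model will likely need far more memory than the GPT-2 example.

```python
import shlex


def build_run_clm_command(
    model_name_or_path,
    output_dir,
    dataset_name="wikitext",
    dataset_config_name="wikitext-2-raw-v1",
    batch_size=1,  # assumed small default; tune for your hardware
):
    """Hypothetical helper: assemble a run_clm.py invocation as an argument list."""
    return [
        "python", "run_clm.py",
        "--model_name_or_path", model_name_or_path,
        "--dataset_name", dataset_name,
        "--dataset_config_name", dataset_config_name,
        "--per_device_train_batch_size", str(batch_size),
        "--per_device_eval_batch_size", str(batch_size),
        "--do_train",
        "--do_eval",
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    # Only the model argument differs from the GPT-2 example in the README.
    command = build_run_clm_command("EleutherAI/gpt-j-6B", "/tmp/test-clm-gptj")
    print(shlex.join(command))
```

The point of the sketch is that nothing in the script is specific to GPT-2: swapping the value of `--model_name_or_path` is the only change.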

docs/source/en/model_doc/gptj.mdx (outdated, resolved)
@adit299
Contributor Author

adit299 commented Jan 28, 2023

It looks like the formatting for the docs is still not correct; the bullet points are all jumbled up. Looking into this...

Member

@stevhliu stevhliu left a comment

One last typo, and the formatting looks good on my end! 👍

docs/source/en/model_doc/gptj.mdx (outdated, resolved)
@adit299 adit299 marked this pull request as ready for review January 30, 2023 20:10
@adit299
Contributor Author

adit299 commented Jan 30, 2023

I have marked the pull request as ready for review 👍 @stevhliu

@stevhliu stevhliu requested a review from sgugger January 30, 2023 21:30
Collaborator

@sgugger sgugger left a comment

Thanks for adding those!

@sgugger sgugger merged commit 914e500 into huggingface:main Jan 30, 2023
miyu386 pushed a commit to miyu386/transformers that referenced this pull request Feb 9, 2023
* Added resource section to GPT-J docs

* Added most of the links found

* Addressing review comments

* Fixing formatting

* Update docs/source/en/model_doc/gptj.mdx

Co-authored-by: Steven Liu <[email protected]>

* Fixing one of the labels

---------

Co-authored-by: Steven Liu <[email protected]>
ArthurZucker pushed a commit to ArthurZucker/transformers that referenced this pull request Mar 2, 2023
Development

Successfully merging this pull request may close these issues.

Model resources contribution
4 participants