Add missing type hints #16059
Comments
I would love to work on PyTorch …
Hi, I would like to work on PyTorch …
Hi, I would like to work on … I will take a look at … Edit: Because CamemBERT depends on …
Hello! I'd like to take … Cheers!
I'll try PyTorch BERT to start!
@johnryan465 I just did it as an example, I'm sorry! I'm marking off the completed models now.
@Rocketknight1 No worries, I'll try to do DistilBERT instead.
I'd like to work on GPT-2 (TF).
@Rocketknight1 I'll switch to Roberta (PyTorch), because CamemBERT depends on the Roberta modeling code.
Awesome! Hey @Rocketknight1 – I'd like to work on Longformer for both PyTorch & TF!
I'd like to work on BigBird.
I would like to work on CLIP for PyTorch.
Also, I will work on …
I can work on ImageGPT.
I can work on Swin (PyTorch).
I'd like to work on XLM (TensorFlow).
I'll take T5 (TensorFlow)!
I'd like to claim GPT-2 (PyTorch).
Hi @Rocketknight1, I would like to work on BART for both TF and PyTorch.
XLMRobertaXL (PyTorch)
@nablabits If the Colab doc says type hints are missing, then at least some are still missing, so you can totally take it! (People often forget the return type, and the Colab will still mark a model as missing type hints if it isn't there.)
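[Editor's note: as an illustration, a minimal sketch of the difference; the two class names here are hypothetical, chosen only to contrast the signatures.]

```python
from typing import Optional, Tuple, Union

import torch
from transformers.modeling_outputs import SequenceClassifierOutput


class IncompleteExample:
    # The argument is hinted, but the return annotation is missing,
    # so the notebook's check would still flag this forward() as incomplete.
    def forward(self, input_ids: Optional[torch.Tensor] = None):
        ...


class CompleteExample:
    # Adding the return annotation as well satisfies the check.
    def forward(
        self, input_ids: Optional[torch.Tensor] = None
    ) -> Union[Tuple, SequenceClassifierOutput]:
        ...
```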
@Rocketknight1, ahh yeah, thanks for the reminder. I've seen that in the comments before, and I'll pay extra attention to it when opening the PR.
Hi @Rocketknight1, I've just pushed the above PR ☝️, hope I'm not missing anything 🤞 Chances are this has been suggested/tried in the past and there's a good reason not to do it, but just in case: we could probably improve the notebook with this bit, which checks not only the arguments but also the return type:

```python
import transformers
from inspect import getfullargspec
from typing import get_type_hints

for obj in dir(transformers):
    try:
        model = getattr(transformers, obj)
        if issubclass(model, transformers.PreTrainedModel):
            actual_hints = set(get_type_hints(model.forward))
            expected_hints = set(getfullargspec(model.forward).args)
            expected_hints.remove("self")  # self does not carry type hints
            expected_hints.add("return")  # we also need a type hint for the output
            missing_hints = expected_hints - actual_hints
            if missing_hints:
                print(f"{obj}: {missing_hints}")
    except Exception:
        # Most exported names are not classes; issubclass and get_type_hints
        # can raise on those, so just skip them.
        pass
```

Running the above yields that, for instance, …
Hey @Rocketknight1, I will continue with …
@sgugger I'd say this might have been closed automatically, but it shouldn't have been. There's probably some automation I don't know about that closes issues whenever certain conditions are met; in that case, let me know if I need to change anything when I open a PR.
@nablabits That happens when you link an issue in the PR - GitHub assumes the PR resolves the issue! Also, your suggestion for the notebook is good - I'll update it today!
@nablabits Notebook updated based on your suggestion, thank you!
Hey Matt, that makes sense, probably it was picking the …
Brilliant, always happy to be useful 🤗 I'm keen to further reduce that list, so I will pick some of the first entries: …
These ones have not been picked yet: …
Is it OK to open a PR with all of them?
Hey Matt, I will now pick these, let's crush that list 📋: …
Hey Matt, I'm now working on the next 6 items: …
Hi Matt, at this point I'm keen to finish all the remaining type hints for the PyTorch models (17). I will open a couple of PRs so the review won't be a pain (unless you tell me otherwise).
So these ☝️ are the last ones for the PyTorch models. Looking at the git history, I can see that you are pretty busy these days, so no rush on this. Here's a quick recap of what we have so far: …
In the meantime I will start getting an intuition of how the TF ones work. I can't say I'm going to address all of them (as there are a lot), but let's see how far we can get 🐌
It turns out that there are not that many after all, because most of them are ultimately subclassing from …:

```python
import transformers
from inspect import getfullargspec, isclass, isfunction
from typing import get_type_hints


def is_call_overridden(cls):
    # True only if the class defines its own call() rather than inheriting it
    return "call" in cls.__dict__ and isfunction(cls.__dict__["call"])


def compute_missing_hints_tf(model):
    if isclass(model) and issubclass(model, transformers.TFPreTrainedModel) and is_call_overridden(model):
        actual_hints = set(get_type_hints(model.call))
        expected_hints = set(getfullargspec(model.call).args)
        expected_hints.remove("self")  # self does not carry type hints
        expected_hints.add("return")  # we also need a type hint for the output
        missing_hints = expected_hints - actual_hints
        if missing_hints:
            print(f"{model.__name__}: {missing_hints}")
```

This yields a much narrower list, so if these assumptions are right, with another PR we might be done with the issue. Matt, let me know what you think.
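[Editor's note: a sketch of a driver loop, mirroring the PyTorch snippet earlier in the thread and assuming `compute_missing_hints_tf` as defined above.]

```python
for obj in dir(transformers):
    try:
        compute_missing_hints_tf(getattr(transformers, obj))
    except Exception:
        # get_type_hints can fail on some symbols (e.g. unresolved forward
        # references), so skip anything that errors out
        pass
```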
@nablabits Ah, of course! I should have realized that the base …
This project is now officially complete! Thank you to everyone in this thread, and to other people who filed PRs, and congratulations to @nablabits, who filed the final PR to finish it all!
Are we any closer to reverting #18485?
This issue is part of our Great Code Cleanup 2022. If you're interested in helping out, take a look at this thread, or come join us on Discord and talk with other contributors!
🚀 Add missing type hints
Type hints are used inconsistently in the `transformers` repo across both TF and PT models, and it'd be nice to make them a complete, consistent thing for the core models, especially because we want to develop features that depend on them!

Guide to contributing: … and remember to run `make fixup` at the end to do a code quality check before your final commit!

Tips for making your PR:
- The files you need to edit will be in `src/transformers/models/[model_name]/`. For TensorFlow, you want the `modeling_tf_[model_name].py` file; for PyTorch, you want the `modeling_[model_name].py` file.
- Type hints should be added to the `call` (for TF) or `forward` (for PT) method for user-facing classes like `TFRobertaForMaskedLM` or `RobertaForSequenceClassification`. It's not necessary to add type hints to layers or base classes like `RobertaModel` or `TFRobertaPreTrainedModel` - these are trickier to write, and generally people do not use those classes as standalone models.
- Most inputs are `Optional[Union[np.ndarray, tf.Tensor]]` for TF models and `Optional[torch.Tensor]` for PyTorch models, and boolean inputs are `Optional[bool]`. Pay attention to the first input of TF models, though, which is usually `TFModelInputType` - this is because Keras handles that first input in a special way! Other inputs to pay attention to are `past_key_values`, which can vary between models, and also the model output type. For the base model classes like `RobertaModel`, you may have to look at the corresponding `MainLayer` to figure out the right output type! Also, note that the output type may be a tuple if `return_dict` is False, in which case you should specify `Union[Tuple, ...]`. Finally, note that in TF models, `training` is never `None`, so it should be `training: bool` and not `training: Optional[bool]` (see the sketch after this list).
- If a method has the decorator/comment `# Copied from transformers.models.bert...`, this means that the code is copied from that source, and our scripts will automatically keep that in sync. If you see that, you should not edit the copied method! Instead, edit the original method it's copied from, and run `make fixup` to synchronize that across all the copies. Be sure you installed the development dependencies with `pip install -e ".[dev]"`, as described in the contributor guidelines above, to ensure that the code quality tools in `make fixup` can run.
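[Editor's note: to make those conventions concrete, a hedged sketch of a fully annotated TF `call` signature; the class name and the exact set of inputs are illustrative, not copied from a real model.]

```python
from typing import Optional, Tuple, Union

import numpy as np
import tensorflow as tf
from transformers.modeling_tf_outputs import TFBaseModelOutputWithPast
from transformers.modeling_tf_utils import TFModelInputType


class TFIllustrativeModel:
    def call(
        self,
        input_ids: Optional[TFModelInputType] = None,  # first input: Keras handles it specially
        attention_mask: Optional[Union[np.ndarray, tf.Tensor]] = None,
        past_key_values: Optional[Tuple[Tuple[tf.Tensor]]] = None,  # shape varies between models
        use_cache: Optional[bool] = None,
        return_dict: Optional[bool] = None,  # if False, the output is a plain tuple
        training: bool = False,  # never None in TF models
    ) -> Union[Tuple, TFBaseModelOutputWithPast]:
        ...
```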
How can I find models that need type hints?
I used to maintain a list here, but it got out of date, I'm sorry. Instead, you can use this Colab notebook. If you run it, it will show you the PyTorch and TF models that are still missing type hints. Unlike my manually curated lists, it's guaranteed to be up to date - but do double-check that someone else in the thread hasn't claimed a model before you start, because the Colab code only registers type hints after the PR containing them is merged!