
Commit

Merge pull request huggingface#28 from huggingface/main
ra
jamesthesnake authored Apr 11, 2023
2 parents 9740634 + fe1f5a6 commit 1d5a2fe
Showing 16 changed files with 93 additions and 26 deletions.
12 changes: 7 additions & 5 deletions CONTRIBUTING.md
@@ -162,14 +162,16 @@ You'll need **[Python 3.7]((https://github.com/huggingface/transformers/blob/mai
it with `pip uninstall transformers` before reinstalling it in editable
mode with the `-e` flag.

-Depending on your OS, you may need to install some external libraries as well if the `pip` installation fails.
-
-For macOS, you will likely need [MeCab](https://taku910.github.io/mecab/) which can be installed from Homebrew:
+Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
+failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
+(PyTorch, TensorFlow and/or Flax) then do:
```bash
-brew install mecab
+pip install -e ".[quality]"
```
+which should be enough for most use cases.
5. Develop the features on your branch.
As you work on your code, you should make sure the test suite
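The `.[dev]` extra pulls in every optional dependency, so whether it installs cleanly largely depends on a deep learning framework already being present. As an illustrative aid (not part of this commit), a quick way to check which frameworks are importable before falling back to the lighter `.[quality]` install:

```python
# Illustrative only: report which deep learning frameworks are importable,
# to decide whether `pip install -e ".[dev]"` is likely to succeed or the
# lighter `pip install -e ".[quality]"` fallback is the better choice.
import importlib.util

for framework in ("torch", "tensorflow", "flax"):
    status = "installed" if importlib.util.find_spec(framework) else "missing"
    print(f"{framework}: {status}")
```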
10 changes: 9 additions & 1 deletion docs/source/en/add_new_model.mdx
@@ -202,7 +202,15 @@ source .env/bin/activate
pip install -e ".[dev]"
```

-and return to the parent directory
+Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
+failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
+(PyTorch, TensorFlow and/or Flax) then do:
+
+```bash
+pip install -e ".[quality]"
+```
+
+which should be enough for most use cases. You can then return to the parent directory

```bash
cd ..
7 changes: 7 additions & 0 deletions docs/source/en/add_tensorflow_model.mdx
@@ -119,6 +119,13 @@ source .env/bin/activate
pip install -e ".[dev]"
```

+Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
+failure with this command. If that's the case make sure to install TensorFlow then do:
+
+```bash
+pip install -e ".[quality]"
+```
+
**Note:** You don't need to have CUDA installed. Making the new model work on CPU is sufficient.

4. Create a branch with a descriptive name from your main branch
15 changes: 13 additions & 2 deletions docs/source/en/pr_checks.mdx
@@ -24,7 +24,7 @@ When you open a pull request on 🤗 Transformers, a fair number of checks will

In this document, we will take a stab at explaining what those various checks are and the reason behind them, as well as how to debug them locally if one of them fails on your PR.

-Note that they all require you to have a dev install:
+Note that, ideally, they require you to have a dev install:

```bash
pip install transformers[dev]
@@ -36,7 +36,18 @@ or for an editable install:
pip install -e .[dev]
```

-inside the Transformers repo.
+inside the Transformers repo. Since the number of optional dependencies of Transformers has grown a lot, it's possible you don't manage to get all of them. If the dev install fails, make sure to install the Deep Learning framework you are working with (PyTorch, TensorFlow and/or Flax) then do
+
+```bash
+pip install transformers[quality]
+```
+
+or for an editable install:
+
+```bash
+pip install -e .[quality]
+```
+

## Tests

3 changes: 3 additions & 0 deletions examples/pytorch/question-answering/run_seq2seq_qa.py
@@ -26,6 +26,7 @@

import datasets
import evaluate
+import numpy as np
from datasets import load_dataset
from trainer_seq2seq_qa import QuestionAnsweringSeq2SeqTrainer

@@ -614,6 +615,8 @@ def post_processing_function(
preds = outputs.predictions
if isinstance(preds, tuple):
preds = preds[0]
+# Replace -100s used for padding as we can't decode them
+preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)

# Build a map example to its corresponding features.
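The added lines guard `batch_decode` against the `-100` sentinel: the evaluation loop pads variable-length generated sequences with `-100`, which is not a valid vocabulary id, so it has to be swapped for the tokenizer's pad token before decoding. A self-contained sketch of the pattern (the checkpoint name and ids below are illustrative, not taken from this commit):

```python
# Illustrative sketch: replace -100 padding with the pad token id before decoding.
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # any seq2seq checkpoint works

# Pretend these are generated ids, padded with -100 by the evaluation loop.
preds = np.array([[71, 1712, 19, 1627, 5, 1, -100, -100]])

preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
decoded = tokenizer.batch_decode(preds, skip_special_tokens=True)
print(decoded)
```

The same replacement is applied below in the summarization and translation examples, both in `compute_metrics` and on the `predict_with_generate` path.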
10 changes: 6 additions & 4 deletions examples/pytorch/summarization/run_summarization.py
@@ -632,10 +632,10 @@ def compute_metrics(eval_preds):
preds, labels = eval_preds
if isinstance(preds, tuple):
preds = preds[0]
+# Replace -100s used for padding as we can't decode them
+preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
-if data_args.ignore_pad_token_for_loss:
-# Replace -100 in the labels as we can't decode them.
-labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
+labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

# Some simple post-processing
@@ -714,8 +714,10 @@ def compute_metrics(eval_preds):

if trainer.is_world_process_zero():
if training_args.predict_with_generate:
+predictions = predict_results.predictions
+predictions = np.where(predictions != -100, predictions, tokenizer.pad_token_id)
predictions = tokenizer.batch_decode(
-predict_results.predictions, skip_special_tokens=True, clean_up_tokenization_spaces=True
+predictions, skip_special_tokens=True, clean_up_tokenization_spaces=True
)
predictions = [pred.strip() for pred in predictions]
output_prediction_file = os.path.join(training_args.output_dir, "generated_predictions.txt")
10 changes: 6 additions & 4 deletions examples/pytorch/translation/run_translation.py
@@ -543,10 +543,10 @@ def compute_metrics(eval_preds):
preds, labels = eval_preds
if isinstance(preds, tuple):
preds = preds[0]
+# Replace -100s used for padding as we can't decode them
+preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
-if data_args.ignore_pad_token_for_loss:
-# Replace -100 in the labels as we can't decode them.
-labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
+labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

# Some simple post-processing
@@ -626,8 +626,10 @@ def compute_metrics(eval_preds):

if trainer.is_world_process_zero():
if training_args.predict_with_generate:
+predictions = predict_results.predictions
+predictions = np.where(predictions != -100, predictions, tokenizer.pad_token_id)
predictions = tokenizer.batch_decode(
-predict_results.predictions, skip_special_tokens=True, clean_up_tokenization_spaces=True
+predictions, skip_special_tokens=True, clean_up_tokenization_spaces=True
)
predictions = [pred.strip() for pred in predictions]
output_prediction_file = os.path.join(training_args.output_dir, "generated_predictions.txt")
@@ -682,6 +682,9 @@ def forward(

lm_loss = None
if labels is not None:
+# move labels to correct device to enable model parallelism
+labels = labels.to(lm_logits.device)
+
# we are doing next-token prediction; shift prediction scores and input ids by one
shift_logits = lm_logits[:, :-1, :].contiguous()
labels = labels[:, 1:].contiguous()
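The new `labels.to(lm_logits.device)` line matters when the model is split across several devices (for example with a `device_map`): the logits come out on the last device while the labels may still sit on the first one, and the loss has to be computed on a single device. A minimal sketch of the pattern, using random tensors rather than any particular model:

```python
# Illustrative sketch of the device-alignment pattern added by this commit.
import torch
import torch.nn as nn

lm_logits = torch.randn(2, 5, 100)       # pretend these came off the last device
labels = torch.randint(0, 100, (2, 5))   # labels may live on a different device

# Move labels to wherever the logits are before computing the loss;
# under model parallelism this avoids a cross-device mismatch error.
labels = labels.to(lm_logits.device)

# Next-token prediction: shift logits and labels by one position.
shift_logits = lm_logits[:, :-1, :].contiguous()
shift_labels = labels[:, 1:].contiguous()

loss = nn.CrossEntropyLoss()(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
print(loss.item())
```

The second hunk below applies the same fix to another model's forward pass.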
@@ -1236,6 +1236,9 @@ def forward(
router_probs = None
aux_loss = None
if labels is not None:
+# move labels to correct device to enable model parallelism
+labels = labels.to(lm_logits.device)
+
loss_fct = nn.CrossEntropyLoss(ignore_index=-100)

if output_router_logits:
4 changes: 2 additions & 2 deletions src/transformers/onnx/features.py
@@ -267,7 +267,7 @@ class FeaturesManager:
onnx_config_cls="models.deberta_v2.DebertaV2OnnxConfig",
),
"deit": supported_features_mapping(
"default", "image-classification", "masked-im", onnx_config_cls="models.deit.DeiTOnnxConfig"
"default", "image-classification", onnx_config_cls="models.deit.DeiTOnnxConfig"
),
"detr": supported_features_mapping(
"default",
@@ -515,7 +515,7 @@ class FeaturesManager:
"vision2seq-lm", onnx_config_cls="models.vision_encoder_decoder.VisionEncoderDecoderOnnxConfig"
),
"vit": supported_features_mapping(
"default", "image-classification", "masked-im", onnx_config_cls="models.vit.ViTOnnxConfig"
"default", "image-classification", onnx_config_cls="models.vit.ViTOnnxConfig"
),
"whisper": supported_features_mapping(
"default",
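With `"masked-im"` dropped from the DeiT and ViT entries, an ONNX export requesting that feature for these architectures should now fail at feature lookup. A hedged sketch of how the registered features can be inspected, assuming the `FeaturesManager.get_supported_features_for_model_type` helper available in this era of the ONNX exporter:

```python
# Sketch: list the ONNX export features registered for a model type.
# Assumes the transformers.onnx.features API as it existed around this commit.
from transformers.onnx.features import FeaturesManager

for model_type in ("vit", "deit"):
    features = FeaturesManager.get_supported_features_for_model_type(model_type)
    # After this change, "masked-im" should no longer appear for either type.
    print(model_type, sorted(features))
```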
3 changes: 2 additions & 1 deletion src/transformers/pipelines/token_classification.py
@@ -68,7 +68,8 @@ class AggregationStrategy(ExplicitEnum):
same entity together in the predictions or not.
stride (`int`, *optional*):
If stride is provided, the pipeline is applied on all the text. The text is split into chunks of size
-model_max_length. Works only with fast tokenizers and `aggregation_strategy` different from `NONE`.
+model_max_length. Works only with fast tokenizers and `aggregation_strategy` different from `NONE`. The
+value of this argument defines the number of overlapping tokens between chunks.
aggregation_strategy (`str`, *optional*, defaults to `"none"`):
The strategy to fuse (or not) tokens based on the model prediction.
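The clarified docstring spells out that `stride` is the number of tokens shared between consecutive chunks when a text longer than `model_max_length` is split. A hedged usage sketch (the checkpoint name and values are illustrative, not taken from this commit):

```python
# Sketch: run token classification over a long text in overlapping chunks.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",     # illustrative NER checkpoint with a fast tokenizer
    aggregation_strategy="simple",   # stride requires a strategy other than NONE
    stride=128,                      # 128 tokens of overlap between consecutive chunks
)

long_text = " ".join(["Hugging Face is based in New York City and Paris."] * 200)
print(ner(long_text)[:3])
```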
16 changes: 16 additions & 0 deletions templates/adding_a_new_model/README.md
@@ -34,6 +34,14 @@ cd transformers
pip install -e ".[dev]"
```

+Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
+failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
+(PyTorch, TensorFlow and/or Flax) then do:
+
+```bash
+pip install -e ".[quality]"
+```
+
Once the installation is done, you can use the CLI command `add-new-model` to generate your models:

```shell script
@@ -133,6 +141,14 @@ cd transformers
pip install -e ".[dev]"
```

+Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
+failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
+(PyTorch, TensorFlow and/or Flax) then do:
+
+```bash
+pip install -e ".[quality]"
+```
+
Once the installation is done, you can use the CLI command `add-new-model-like` to generate your models:

```shell script
6 changes: 3 additions & 3 deletions tests/deepspeed/test_deepspeed.py
@@ -869,8 +869,8 @@ class TestDeepSpeedWithLauncher(TestCasePlus):
# 2. most tests should probably be run on both: zero2 and zero3 configs
#

-@require_torch_multi_gpu
@parameterized.expand(params, name_func=parameterized_custom_name_func)
+@require_torch_multi_gpu
def test_basic_distributed(self, stage, dtype):
self.run_and_check(stage=stage, dtype=dtype, distributed=True)

@@ -900,8 +900,8 @@ def test_fp32_non_distributed(self, stage, dtype):
fp32=True,
)

-@require_torch_multi_gpu
@parameterized.expand(params, name_func=parameterized_custom_name_func)
+@require_torch_multi_gpu
def test_fp32_distributed(self, stage, dtype):
# real model needs too much GPU memory under stage2+fp32, so using tiny random model here -
# therefore no quality checks, just basic completion checks are done
@@ -941,8 +941,8 @@ def test_resume_train_not_from_ds_checkpoint(self, stage, dtype):

self.do_checks(output_dir, do_train=do_train, do_eval=do_eval)

-@require_torch_multi_gpu
@parameterized.expand(["bf16", "fp16", "fp32"])
+@require_torch_multi_gpu
def test_inference(self, dtype):
if dtype == "bf16" and not is_torch_bf16_gpu_available():
self.skipTest("test requires bfloat16 hardware support")
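Moving `@require_torch_multi_gpu` below `@parameterized.expand` is presumably the point of this change: `parameterized.expand` generates one test method per parameter set when the class body runs, so only decorators written below it (applied first) are carried into those generated tests. A small self-contained sketch of the behaviour, using a stand-in skip decorator instead of the real `require_torch_multi_gpu` (requires the `parameterized` package):

```python
# Sketch: a skip decorator placed below parameterized.expand is applied to
# every generated test; placed above it, the skip would not reach them.
import unittest
from parameterized import parameterized

def require_two_gpus(test_case):
    # Stand-in for @require_torch_multi_gpu: always skips in this sketch.
    return unittest.skipUnless(False, "needs at least 2 GPUs")(test_case)

class DemoTest(unittest.TestCase):
    @parameterized.expand(["zero2", "zero3"])
    @require_two_gpus
    def test_distributed(self, stage):
        self.fail("should have been skipped")  # never reached: the skip is kept

if __name__ == "__main__":
    unittest.main()
```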
2 changes: 1 addition & 1 deletion tests/models/markuplm/test_modeling_markuplm.py
@@ -378,7 +378,7 @@ def test_forward_pass_no_head(self):
self.assertEqual(outputs.last_hidden_state.shape, expected_shape)

expected_slice = torch.tensor(
-[[0.0267, -0.1289, 0.4930], [-0.2376, -0.0342, 0.2381], [-0.0329, -0.3785, 0.0263]]
+[[0.0675, -0.0052, 0.5001], [-0.2281, 0.0802, 0.2192], [-0.0583, -0.3311, 0.1185]]
).to(torch_device)

self.assertTrue(torch.allclose(outputs.last_hidden_state[0, :3, :3], expected_slice, atol=1e-4))