Excluding AdamWeightDecayOptimizer internal variables from restoring #16

donatasrep · 2018-11-13T15:13:18Z

I tried to use the convert_tf_checkpoint_to_pytorch.py script to convert my pretrained model, but to do so I had to make some minor tweaks. I thought I would share them in case you find them useful.
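The thread does not show the tweak itself. As a rough illustration of the idea in the title, here is a minimal sketch that filters optimizer state out of a list of checkpoint variable names before any weights are copied. It assumes the optimizer's internal variables are stored under the same path as the parameter they track, with an `adam_m` or `adam_v` suffix. The helper names (`is_optimizer_variable`, `filter_model_variables`) and the sample variable names are hypothetical and are not taken from the script.

```python
# Sketch only: the slot names below are an assumption about how
# AdamWeightDecayOptimizer names its per-parameter state in a checkpoint.
OPTIMIZER_SLOT_NAMES = {"adam_m", "adam_v"}


def is_optimizer_variable(name):
    """Return True if any path component of `name` is an optimizer slot."""
    return any(part in OPTIMIZER_SLOT_NAMES for part in name.split("/"))


def filter_model_variables(names):
    """Keep only the variables that belong to the model itself."""
    return [name for name in names if not is_optimizer_variable(name)]


# Hypothetical variable names, shaped like those in a pretraining checkpoint.
checkpoint_names = [
    "bert/embeddings/word_embeddings",
    "bert/embeddings/word_embeddings/adam_m",
    "bert/embeddings/word_embeddings/adam_v",
    "bert/encoder/layer_0/attention/self/query/kernel",
    "bert/encoder/layer_0/attention/self/query/kernel/adam_m",
]

kept = filter_model_variables(checkpoint_names)
for name in kept:
    print("Restoring", name)
```

Matching on whole path components rather than substrings avoids accidentally skipping a model variable whose name merely contains one of the slot names.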

thomwolf · 2018-11-13T15:15:30Z

Is your pre-trained model a TensorFlow model?

donatasrep · 2018-11-13T15:16:47Z

Yes

thomwolf · 2018-11-13T15:19:35Z

Nice, thanks for that!

* update kd qa in roberta modeling * fix issues for kd-quac runner * add adversarial training support for roberta question answering * add adversarial training support for roberta question answering (cont.) * add adversarial training support for roberta question answering (cont.)

Excluding AdamWeightDecayOptimizer internal variables from restoring

20d07b3

thomwolf merged commit 5cd8d7a into huggingface:master Nov 13, 2018

qwang70 pushed a commit to DRL36/pytorch-pretrained-BERT that referenced this pull request Mar 2, 2019

Merge pull request huggingface#16 from donatasrep/master

39e0bab

Excluding AdamWeightDecayOptimizer internal variables from restoring

maeotaku mentioned this pull request May 23, 2019

bert->onnx ->caffe2 weird error #633

Closed

HongyanJiao mentioned this pull request Sep 19, 2019

traced_model #1291

Closed

manchandasahil mentioned this pull request Mar 22, 2021

Longformer training : CUDA error: device-side assert triggered #10852

Closed

2 tasks

amathews-amd referenced this pull request in ROCm/transformers Aug 6, 2021

Merge pull request #16 from microsoft/pr_for_running_roberta_with_ort…

d25a36f

…module hack to make roberta can run it ortmodule

rraminen pushed a commit to rraminen/transformers that referenced this pull request Oct 27, 2022

Merge pull request huggingface#16 from ROCmSoftwarePlatform/adabeyta_…

24b288f

…update_hf_training Removed hardcoded warmup steps.

jlamypoirier added a commit to jlamypoirier/transformers that referenced this pull request Apr 4, 2023

Misc additions (huggingface#16)

43570b6

jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this pull request Jun 1, 2023

Merge pull request huggingface#16 from huggingface/main

7a17fbe

wed

lwmlyy mentioned this pull request Aug 15, 2023

add util for ram efficient loading of model when using fsdp #25107

Merged

1 task

ZYC-ModelCloud pushed a commit to ZYC-ModelCloud/transformers that referenced this pull request Nov 14, 2024

Merge pull request huggingface#16 from PanQiWei/triton_integration

8281547

Triton integration

ZYC-ModelCloud pushed a commit to ZYC-ModelCloud/transformers that referenced this pull request Nov 14, 2024

Cleanup (huggingface#16)

37a8cd3

* pkg depends update * remove fused attention/mlp * Remove triton v1 and cleanup unused fused files * cleanup unused args

ZYC-ModelCloud pushed a commit to ZYC-ModelCloud/transformers that referenced this pull request Nov 14, 2024

Rename tooling var (huggingface#16)

e7f99c3

* rename META_QUANTIZER_AUTOGPTQ to META_QUANTIZER_GTPQMODEL * fix typo
