Please note that some of the script examples (e.g., the pretrain_*.sh scripts directly under the Megatron-DeepSpeed/examples/ folder) come from NVIDIA's original Megatron-LM and do not have DeepSpeed integration (scripts with DeepSpeed integration include the deepspeed keyword). Below we list various examples that do have DeepSpeed integration.
We strongly recommend starting with the AzureML recipe in the azureml folder.
If you have custom infrastructure (e.g., HPC clusters) or an Azure VM- and VMSS-based environment, please refer to the bash scripts in the azure folder.
Please see the MoE folder for different training recipes and scripts for Mixture-of-Experts (MoE) based models and dense models. These recipes are for GPT-style NLG models; a rough launch sketch follows below.
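As a rough illustration of what an MoE recipe configures (a sketch only: the model sizes, the expert count, and the use of the --num-experts flag here are assumptions, not taken from the actual recipes), a launch might look like:

```bash
# Hypothetical sketch of an MoE-style launch via the deepspeed launcher.
# See the actual scripts in the MoE folder for the real flags and values.
deepspeed pretrain_gpt.py \
    --num-layers 12 \
    --hidden-size 768 \
    --num-attention-heads 12 \
    --num-experts 64 \
    --deepspeed \
    --deepspeed_config ds_config.json
```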
Curriculum learning recipes are in the curriculum_learning folder; please refer to the detailed tutorials linked inside. These recipes are for GPT-style NLG models.
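For orientation, curriculum learning in DeepSpeed is driven by a section of the DeepSpeed JSON config. The sketch below assumes a sequence-length curriculum; the numeric values are placeholder assumptions, so consult the linked tutorials for recommended settings:

```bash
# Sketch of a DeepSpeed config enabling sequence-length curriculum learning.
# All numeric values below are placeholders, not tuned recommendations.
cat > ds_config.json <<'EOF'
{
  "train_batch_size": 512,
  "curriculum_learning": {
    "enabled": true,
    "curriculum_type": "seqlen",
    "min_difficulty": 8,
    "max_difficulty": 1024,
    "schedule_type": "fixed_linear",
    "schedule_config": {
      "total_curriculum_step": 15000,
      "difficulty_step": 8
    }
  }
}
EOF
```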
The compression folder includes examples of layer reduction for task-agnostic compression. Please refer to this tutorial on the DeepSpeed Model Compression Library. These recipes are for GPT-style NLG models.
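For a sense of what layer reduction looks like in a DeepSpeed compression config, here is a minimal sketch; the module names and layer indices are placeholder assumptions (the module_name_prefix shown is a BERT-style name used purely for illustration), so refer to the tutorial for settings that match your model:

```bash
# Sketch of the layer-reduction section of a DeepSpeed compression config.
# Module names and layer indices are placeholders; see the DeepSpeed Model
# Compression Library tutorial for real settings.
cat > ds_config.json <<'EOF'
{
  "compression_training": {
    "layer_reduction": {
      "enabled": true,
      "keep_number_layer": 5,
      "module_name_prefix": "bert.encoder.layer",
      "teacher_layer": [2, 4, 6, 8, 10],
      "other_module_name": ["bert.pooler", "bert.embeddings"]
    }
  }
}
EOF
```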
The bert_with_pile folder includes examples of BERT-style model pre-training (using the public Pile data or your own data) with DeepSpeed integration. Please refer to the readme in the folder for a tutorial.
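As a rough sketch of how such a DeepSpeed-integrated pre-training launch looks (the argument values and config file name here are assumptions; the folder's readme has the real recipe):

```bash
# Hypothetical minimal launch of DeepSpeed-integrated BERT pre-training.
# The real recipes live in the bert_with_pile folder; the values below
# are placeholders.
deepspeed pretrain_bert.py \
    --num-layers 24 \
    --hidden-size 1024 \
    --num-attention-heads 16 \
    --seq-length 512 \
    --deepspeed \
    --deepspeed_config ds_config.json
```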