how to convert huggingface model to megatron-deepspeed? #329

Closed
yayaQAQ opened this issue Aug 12, 2022 · 8 comments

Comments

@yayaQAQ

yayaQAQ commented Aug 12, 2022

As the title says.

@mayank31398
Collaborator

This is not possible.
To download DS checkpoints refer to this issue:
#319

@yayaQAQ
Author

yayaQAQ commented Aug 13, 2022

This is not possible. To download DS checkpoints refer to this issue: #319

Why? So I have to train from scratch?
That's hard.

@mayank31398
Collaborator

I don't understand the issue. Do you just need to run inference?
If that is the case, DS-inference is compatible with all Huggingface models.

@AnShengqiang

I don't understand the issue. Do you just need to run inference? If that is the case, DS-inference is compatible with all Huggingface models.

Hello, I have the same problem.

I want to load a model from Huggingface as pretrained weights and continue training it with the Megatron-DeepSpeed framework.

But I don't know how to convert Huggingface weights into Megatron-DeepSpeed weights.

I look forward to your help. Thank you.

@AnShengqiang

By the way:

Model structure: GPT
Model link: https://huggingface.co/TsinghuaAI/CPM-Generate

I want to train the model with 4-way pipeline parallelism and DeepSpeed.
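
To illustrate what 4-way pipeline parallelism implies for a checkpoint, here is a minimal sketch in Python of splitting transformer layers into contiguous stages. The function name `partition_layers` and the 32-layer count are hypothetical, chosen only for illustration. The actual partitioning in Megatron-DeepSpeed is decided by the framework and may differ from this even split.

```python
def partition_layers(num_layers: int, num_stages: int) -> list:
    """Split layer indices 0..num_layers-1 into contiguous, near-equal stages.

    Hypothetical helper for illustration only; not part of any library.
    """
    if num_stages <= 0 or num_layers < num_stages:
        raise ValueError("need at least one layer per stage")
    base, extra = divmod(num_layers, num_stages)
    stages = []
    start = 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if stage < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


if __name__ == "__main__":
    # ASSUMPTION: 32 layers is an example value, not the size of any real model.
    for stage_id, layers in enumerate(partition_layers(32, 4)):
        print(f"stage {stage_id}: layers {layers.start}..{layers.stop - 1}")
```

Under this assumption, each stage owns only its own slice of layers, so a converted checkpoint would need every layer's weights routed to the stage that holds it. That is one reason a single Huggingface weight file cannot simply be loaded as-is.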

@mayank31398
Collaborator

@AnShengqiang It's non-trivial to convert models for training.
People are actively exploring this, as far as I know.
This repository saves something called a universal checkpoint, which can be converted to other checkpoints.
However, I am quite new here, so I don't really know how that works.
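
As a rough illustration of why conversion is non-trivial, the sketch below shows only the first step, renaming parameters. It assumes the source uses Huggingface GPT-2 style names and that the target uses Megatron-style sublayer names; both are assumptions and must be checked against the real checkpoints (the CPM-Generate model linked above may use a different layout). `rename_key` and `SUBLAYER_MAP` are hypothetical names, not part of any library.

```python
import re
from typing import Optional

# ASSUMPTION: keys are Huggingface GPT-2 style sublayer names, values are
# Megatron-style sublayer names. Verify both against actual checkpoints.
SUBLAYER_MAP = {
    "ln_1": "input_layernorm",
    "attn.c_attn": "self_attention.query_key_value",
    "attn.c_proj": "self_attention.dense",
    "ln_2": "post_attention_layernorm",
    "mlp.c_fc": "mlp.dense_h_to_4h",
    "mlp.c_proj": "mlp.dense_4h_to_h",
}

# Matches per-layer parameters such as "transformer.h.3.attn.c_attn.weight".
LAYER_KEY = re.compile(r"^transformer\.h\.(\d+)\.(.+)\.(weight|bias)$")


def rename_key(hf_key: str) -> Optional[str]:
    """Return the assumed target name for hf_key, or None if it is unmapped."""
    match = LAYER_KEY.match(hf_key)
    if match is None:
        return None
    layer, sublayer, kind = match.groups()
    target = SUBLAYER_MAP.get(sublayer)
    if target is None:
        return None
    # ASSUMPTION: the "layers.<index>." prefix is illustrative only.
    return f"layers.{layer}.{target}.{kind}"


if __name__ == "__main__":
    print(rename_key("transformer.h.3.attn.c_attn.weight"))
```

Renaming alone would not be enough. The tensors would likely also need layout changes (Huggingface GPT-2 stores these projections in a transposed `Conv1D` layout) and would have to be split to match the tensor-parallel and pipeline-parallel configuration used for training.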

@AnShengqiang

@AnShengqiang It's non-trivial to convert models for training. People are actively exploring this, as far as I know. This repository saves something called a universal checkpoint, which can be converted to other checkpoints. However, I am quite new here, so I don't really know how that works.

Thank you for your reply. I will look for an answer, and if I find a solution, I will post it here.

@yayaQAQ yayaQAQ closed this as completed Jan 14, 2023
@stgzr

stgzr commented Mar 1, 2023

I don't understand the issue. Do you just need to run inference? If that is the case, DS-inference is compatible with all Huggingface models.

Hello, I have the same problem.

I want to load a model from Huggingface as pretrained weights and continue training it with the Megatron-DeepSpeed framework.

But I don't know how to convert Huggingface weights into Megatron-DeepSpeed weights.

I look forward to your help. Thank you.

Same problem. Are there any tools that can do this?
