How can I load the model after pre-training? #55
Unanswered
MouadGhouti asked this question in Q&A

I know this might seem a bit obvious to some of you guys, but this is genuinely my first time ever messing with LLMs. I'm currently training a model, and I was wondering how I can use it for inference after training. I have the PyTorch checkpoint files, but I have no idea how to load the model once training is done. Also, how can I upload the model to Hugging Face to make it available to other people?
Replies: 1 comment
GPTConfig and GPT are defined in train_gpt2.py; I put them in a separate file, gpt_class.py:

```python
import torch
from gpt_class import GPTConfig, GPT  # the classes copied out of train_gpt2.py

device = "cpu"
checkpoint = torch.load('log/model_19072.pt', map_location=device)
model = GPT(checkpoint['config'])           # assuming train_gpt2.py saved a dict with the GPTConfig under 'config'
model.load_state_dict(checkpoint['model'])  # ...and the weights under 'model'
model.eval()
```
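To actually run inference with the loaded model, the usual autoregressive sampling loop is enough. Below is a minimal sketch, not the script's own code: it assumes GPT.forward returns (logits, loss) as in train_gpt2.py, reuses model and device from the snippet above, and uses tiktoken's GPT-2 tokenizer with top-k sampling similar to the sampling block in the training script.

```python
# Minimal sampling sketch; assumes `model` and `device` from the loading snippet above.
import tiktoken
import torch
import torch.nn.functional as F

enc = tiktoken.get_encoding("gpt2")
prompt = "Hello, I'm a language model,"
x = torch.tensor(enc.encode(prompt), dtype=torch.long).unsqueeze(0).to(device)  # (1, T)

with torch.no_grad():
    while x.size(1) < 64:                                     # generate up to 64 tokens total
        logits, _ = model(x)                                  # (1, T, vocab_size)
        probs = F.softmax(logits[:, -1, :], dim=-1)           # distribution over the next token
        topk_probs, topk_idx = torch.topk(probs, 50, dim=-1)  # top-k 50, as in the training script's sampler
        ix = torch.multinomial(topk_probs, 1)                 # sample within the top-k
        next_tok = torch.gather(topk_idx, -1, ix)             # map back to vocabulary ids
        x = torch.cat((x, next_tok), dim=1)

print(enc.decode(x[0].tolist()))
```

On a GPU you would move the model and the tokens to "cuda" (and optionally wrap the forward in torch.autocast), but the CPU version is enough to sanity-check that the checkpoint loads correctly.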
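As for sharing the model on Hugging Face: one simple option is to push the raw checkpoint (plus gpt_class.py, so others can rebuild the model) to a model repo with the huggingface_hub client. This is only a sketch with a placeholder repo id; converting the weights into the transformers GPT2LMHeadModel format would be an alternative if you want people to load it via from_pretrained.

```python
# Sketch: push the raw checkpoint to a Hugging Face model repo.
# Needs `pip install huggingface_hub` and `huggingface-cli login` beforehand.
# "your-username/nanogpt-pretrained" is a placeholder repo id.
from huggingface_hub import HfApi

api = HfApi()
api.create_repo("your-username/nanogpt-pretrained", repo_type="model", exist_ok=True)
for local_path in ["log/model_19072.pt", "gpt_class.py"]:
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=local_path.split("/")[-1],
        repo_id="your-username/nanogpt-pretrained",
    )
```

Anyone can then fetch the checkpoint with huggingface_hub.hf_hub_download("your-username/nanogpt-pretrained", "model_19072.pt") and load it with the snippet above.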