
multi-threaded model initialization #737

Status: Open. Wants to merge 1 commit into master.

Conversation
@ngc92 (Contributor) commented Aug 12, 2024

A simple demonstration of initializing the model from multiple threads.
The parallel part is factored out into a separate function; this makes it less likely that we accidentally mutate shared state by capturing variables from the outer scope.

@@ -504,6 +504,47 @@ void gpt2_build_from_checkpoint(GPT2 *model, const char* checkpoint_path, bool w
cudaCheck(cudaDeviceSynchronize());
}

void gpt2_init_layer(GPT2 *model, int l, mt19937_state* rng, floatX* params) {
int offset = 0;
Review comment:
This should be size_t for larger model sizes.

@ademeure (Contributor) commented:

Looks good to me, it improves startup time for -e "d72" from ~100s to ~15s on a 1xH100 node with 26 CPU cores! :)
