Add gpt-j-6B w/ deepspeed example #87
Preliminary results:
Getting an error I've not yet encountered at the last request in above, which does not finish:
    request_params["BENCHMARK_SEQUENCE_LENGTH"] < request_params["MAX_LENGTH"]
)

sequence_start = random.randrange(
You can iterate over these ranges:
self.input_sizes = [8, 512, 1536]
self.context_sizes = [16, 1024, 2048]
You would have to run it in a loop and report the 25th, 50th, and 75th percentiles over 50 experiments, with a cutoff of 10 runs before measurement starts.
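A minimal sketch of that loop, under a few assumptions that are not from the PR: each input size is paired with the context size at the same index, the cutoff means 10 discarded warm-up runs followed by 50 measured runs, and `generate` is a hypothetical callable standing in for the actual model call.

```python
import statistics
import time
from typing import Callable, Dict, Tuple

INPUT_SIZES = [8, 512, 1536]
CONTEXT_SIZES = [16, 1024, 2048]
NUM_EXPERIMENTS = 50  # measured runs per size pair (assumed meaning of "50 experiments")
WARMUP_RUNS = 10      # discarded runs per size pair (assumed meaning of "cut off of 10")


def benchmark(
    generate: Callable[[int, int], None],
    runs: int = NUM_EXPERIMENTS,
    warmup: int = WARMUP_RUNS,
) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Time `generate(input_size, context_size)` and return quartiles per size pair."""
    results = {}
    # Assumption: sizes are paired by index, which keeps input < context.
    for input_size, context_size in zip(INPUT_SIZES, CONTEXT_SIZES):
        timings = []
        for i in range(warmup + runs):
            start = time.perf_counter()
            generate(input_size, context_size)
            elapsed = time.perf_counter() - start
            if i >= warmup:  # skip warm-up runs
                timings.append(elapsed)
        # n=4 yields the three cut points: 25th, 50th, 75th percentile.
        p25, p50, p75 = statistics.quantiles(timings, n=4)
        results[(input_size, context_size)] = {"p25": p25, "p50": p50, "p75": p75}
    return results
```

Passing the model call in as a function keeps the timing loop independent of how the request is actually sent.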
sequence_end = sequence_start + request_params["BENCHMARK_SEQUENCE_LENGTH"]
random_sequence_encoded = self.dataset[sequence_start:sequence_end]
random_sequence = self.tokenizer.decode(random_sequence_encoded)
Can you explain how you are randomizing the sequences here? Does randrange require two args? See: https://www.w3schools.com/python/ref_random_randrange.asp
randrange() can take 1-3 args. With 1 arg, it returns a random integer from 0 up to, but not including, your arg N.
We pick a random start that is at least BENCHMARK_SEQUENCE_LENGTH before the end of self.dataset, so that when we add BENCHMARK_SEQUENCE_LENGTH to the start, we don't go out of range. Then we grab the slice of self.dataset from the start to the end index.
The 3 arg form of randrange() could also be used, but being limited to incrementing by the step arg would limit the number of random sequences we could get.
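As an illustration of the explanation above, here is a minimal sketch of picking a random fixed-length window with the 1 arg form of randrange(). The function name `random_window` and the plain list of token IDs are assumptions for the example, not the code in this PR. The `+ 1` in the bound is my choice so that the window ending exactly at the end of the dataset can also be picked.

```python
import random
from typing import List, Optional


def random_window(
    dataset: List[int], length: int, rng: Optional[random.Random] = None
) -> List[int]:
    """Return a random contiguous slice of `dataset` with exactly `length` items."""
    if length > len(dataset):
        raise ValueError("length must not exceed the dataset size")
    rng = rng if rng is not None else random.Random()
    # randrange(n) returns an integer in [0, n), so the largest possible start
    # is len(dataset) - length and start + length never goes out of range.
    start = rng.randrange(len(dataset) - length + 1)
    end = start + length
    return dataset[start:end]
```

Passing a seeded `random.Random` instance makes the chosen windows reproducible between benchmark runs.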
return {
    "benchmark_results": {
        "input_sequence_length": request_params["BENCHMARK_SEQUENCE_LENGTH"],
        "generated_tokens": request_params["MAX_LENGTH"]
        - request_params["BENCHMARK_SEQUENCE_LENGTH"],
        "time": generation_time,
    }
I think iterating over the sequences will change this. You can append each result to a JSON file and read it back at the end to get the quantiles for however many experiments you run.
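One way this could look, as a sketch only: each result is appended as one JSON object per line (JSON Lines, which is my choice for making appends safe, not something specified here), and the file is read once at the end. The helper names are hypothetical, and the record shape follows the "benchmark_results" dict shown in the diff.

```python
import json
import statistics
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Union


def append_result(path: Union[str, Path], result: dict) -> None:
    """Append one benchmark result as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(result) + "\n")


def quartiles_by_input_length(path: Union[str, Path]) -> Dict[int, List[float]]:
    """Read all results and return [p25, p50, p75] of "time" per input length."""
    times = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)["benchmark_results"]
            times[record["input_sequence_length"]].append(record["time"])
    # Groups with fewer than two samples are skipped, since quartiles
    # of a single measurement are not meaningful.
    return {
        length: statistics.quantiles(values, n=4)
        for length, values in times.items()
        if len(values) >= 2
    }
```

Writing one line per request means a crashed or unfinished run still leaves every completed measurement readable.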
Closing pending Deepspeed updates/improvements:
Related GH Issues:
I'm not sure where to go from here and don't want to spend any more time on this without getting some more feedback.
Fixes https://github.com/coreweave/infra-pm/issues/249