Hi,

Not sure if this is a bug or perhaps a misimplementation, but I'm comparing the results of using gpt2-large on the 'Write With Transformers' text generation demo - https://transformer.huggingface.co/doc/gpt2-large - against my own implementation of the text generation tool. My use case is to generate distractor options in an MCQ setting, given a few (15-20) prior examples of the style of distractors to generate.
The general format I am implementing is:
Question: _______ . Answer 1: ___<the correct answer>___ . Answer 2: _____<distractor 1>_____ . Answer 3: ____<distractor 2>_____
Write With Transformers
Full doc available here - https://transformer.huggingface.co/share/CZqVXdngic
Model Config
Model size - gpt2-large
Top-p - 0.9
Temperature - 1
Max time - 2.3
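As far as I can tell, these settings map onto transformers' generate() roughly as follows. This is just a sketch of my understanding, not necessarily what WWT runs server-side, and max_time only exists in more recent transformers releases:

# Hypothetical generate() kwargs mirroring the WWT config above.
gen_kwargs = dict(
    do_sample=True,   # WWT samples rather than greedy-decoding
    top_p=0.9,        # Top-p
    temperature=1.0,  # Temperature
    max_time=2.3,     # Max time, in seconds (recent transformers only)
)
# usage: model.generate(**inputs, **gen_kwargs)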
Output
On the 'Write With Transformers' page, I write in the examples, e.g.:
Question: How is calorie related to the S.I unit of that quantity?Answer 1: 1 cal = 4.2 J.Answer 2: 1 cal = 3.2 J.Answer 3: 1 cal = 10 J.
and when I try to generate predictions for the following Question-Answer pair: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W.
I am able to generate a few solid distractors:
Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: 1 H. P. = 1. 3 W. Answer 3 : 1 H . P. = 1. 5 W.
Custom Case
Here's how I create a dataset from the set of questions I initially wrote myself.
import random

def make_dataset(dataset, epochs):
    # Join all QA strings with <|endoftext|> separators, reshuffling each epoch.
    total_text = '<|endoftext|>'
    qa = [t for t in dataset]
    for _ in range(epochs):
        random.shuffle(qa)
        total_text += '<|endoftext|>'.join(qa) + '<|endoftext|>'
    return total_text
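For example, with a couple of toy QA strings (hypothetical, just to show the shape of the training text):

# Toy illustration of the training text make_dataset produces.
qa_pairs = [
    'Question: Q1? Answer 1: A.Answer 2: B.Answer 3: C.',
    'Question: Q2? Answer 1: D.Answer 2: E.Answer 3: F.',
]
text = make_dataset(qa_pairs, epochs=2)
# Each epoch appends one reshuffled pass over the pairs, e.g.:
# <|endoftext|>Question: Q2? ...<|endoftext|>Question: Q1? ...<|endoftext|>...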
These are the generation params:

num_return_sequences = 5

And here's how I generate from the fine-tuned model (run from a Jupyter/IPython notebook, hence the ! shell escape):

for start in SENTENCES:
    # Shell out to the transformers example script for each prompt.
    val = !python run_generation.py \
        --model_type gpt2 \
        --model_name_or_path output/$handle \
        --length 40 \
        --num_return_sequences $num_return_sequences \
        --temperature 0.23 \
        --p 0.95 \
        --seed $seed \
        --prompt {'"<|endoftext|>' + start + '"'}
    # Pick the generated sequences out of the script's stdout
    # (every other line from the end).
    generated = [val[-1-2*k] for k in range(num_return_sequences)[::-1]]
    print(f'\nStart of sentence: {start}')
    for i, g in enumerate(generated):
        g = g.replace('<|endoftext|>', '')
        print(f'* Generated #{i+1}: {g}')
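For reference, a rough in-process equivalent of the shell-out above, assuming the same sampling params. 'output/my-handle' is a stand-in for my fine-tuned checkpoint directory, and 42 for the seed:

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Stand-in path for the fine-tuned checkpoint written during training.
tokenizer = GPT2Tokenizer.from_pretrained('output/my-handle')
model = GPT2LMHeadModel.from_pretrained('output/my-handle')
torch.manual_seed(42)  # stand-in for --seed

start = 'Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W.'
inputs = tokenizer('<|endoftext|>' + start, return_tensors='pt')

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.23,        # --temperature
    top_p=0.95,              # --p
    max_new_tokens=40,       # --length
    num_return_sequences=5,  # --num_return_sequences
    pad_token_id=tokenizer.eos_token_id,
)
for i, seq in enumerate(outputs):
    print(f'* Generated #{i+1}: {tokenizer.decode(seq, skip_special_tokens=True)}')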
These are my generated MCQ pairs:
Generated #1: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: กราท.Answer 3: 50 J.
Generated #2: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: กราท.Answer 3: 50 J.
Generated #3: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: ______________ = 905 J.Answer 3: 50 J.
Generated #4: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: กราท.Answer 3: 50 J.
Generated #5: Question: How is the unit horse power related to the S.I. unit of power? Answer 1: 1 H.P. = 750 W. Answer 2: กราท.Answer 3: 50 J.