Hey, first PR for me here:
adding ShaderEval task 1 (return completion); this essentially implements the task as it already exists in the EvaluationSuite.

The task is very much just meant as a "proof of concept", as there are several issues with it. I do plan on introducing more tasks to this benchmark soon and also making them generally better. I have several questions too.
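For context, here is a minimal sketch of what such a "return completion" task roughly looks like in the harness. This is not the code from this PR: the base-class import path, the dataset path, and the field names (`body`, `return_statement`) are placeholders, and the methods simply follow the harness's usual Task interface.

```python
# Minimal sketch only - dataset path, field names and import path are assumptions.
from evaluate import load
from lm_eval.base import Task  # base class location may differ between harness versions


class ReturnCompletion(Task):
    DATASET_PATH = "Vipitis/Shadertoys-fine"  # placeholder dataset name

    def __init__(self):
        # stop generating once a ";" has been produced; no code execution needed
        super().__init__(stop_words=[";"], requires_execution=False)

    def get_dataset(self):
        return self.dataset["test"]

    def get_prompt(self, doc):
        # the function body up to the point where the return statement starts
        return doc["body"]

    def get_reference(self, doc):
        # the ground-truth return statement
        return doc["return_statement"]

    def postprocess_generation(self, generation, idx):
        # strip the prompt and keep everything up to and including the first ";"
        prompt = self.get_prompt(self.get_dataset()[idx])
        completion = generation[len(prompt):]
        return completion.split(";")[0] + ";"

    def process_results(self, generations, references):
        # exact match between generated and reference return statements
        exact_match = load("exact_match")
        return exact_match.compute(
            predictions=[gen[0] for gen in generations],
            references=references,
        )
```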
Some differences that should not impact the results:

- `postprocess_generation` returns the generation truncated at the first `";"`, not the list of all tokens containing the semicolon (does the `EndOfFunctionCriteria` handle this?) - see the sketch after this list
- `--do_sample False` to use greedy search (temperature can't be set to 0?)
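To make the first point concrete, here is a tiny illustration (hypothetical code, not from this PR) of the difference between trimming the decoded string at the first `";"` and a stopping criterion, which only decides when generation halts and does not trim the output:

```python
def truncate_at_semicolon(completion: str) -> str:
    # what the post-processing effectively does: keep everything up to and
    # including the first ";" of the decoded completion
    return completion.split(";")[0] + ";"


# A stopping criterion such as EndOfFunctionCriteria typically only halts
# generation once the stop word has appeared in every sequence of the batch,
# so stray tokens after the ";" can still be present in the raw text and
# need to be cut afterwards.
raw = "return color * 0.5; } // stray tokens generated before stopping"
print(truncate_at_semicolon(raw))  # -> "return color * 0.5;"
```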
Concerns I hope to address:

- `import fcntl` fails on my home machine (Windows) as that module does not exist there, so running the tests wasn't possible

Results match for `gpt2`, `bigscience/bloom-560m` and `Vipitis/santacoder-finetuned-Shadertoys-fine` when running just 10 samples. Additionally, I did a single run with 300 samples (this snippet is used throughout the paper) and got matching numbers of 0.566.

Run parameters were the following:
```
accelerate launch main.py \
    --model Vipitis/santacoder-finetuned-Shadertoys-fine \
    --tasks ShaderEval \
    --limit 10 \
    --do_sample False \
    --save_generations \
    --save_generations_path generations_py.json \
    --use_auth_token \
    --trust_remote_code
```
(Omitting the last two flags doesn't throw any error; the run still completes, even slower, but returns erroneous outputs.)