
[WIP] adding ShaderEval tasks #97

Closed
wants to merge 6 commits

Conversation


@Vipitis Vipitis commented Jun 19, 2023

Hey, first PR for me here:

Adding ShaderEval task1. This essentially implements the task as it exists in the EvaluationSuite: return-completion.
The task is very much meant as a proof of concept, as there are several known issues with it. I plan to introduce more tasks to this benchmark soon and to improve them generally.
I have several questions too.

some differences that should not impact the results:

  • generation returns the prompt as well, so we remove it again in postprocess_generation
  • the stop word is set to ";" - not the list of all tokens containing the semicolon (does EndOfFunctionCriteria handle this?)
  • generation parameters can't be set, so the user has to pass --do_sample False to use greedy search (temperature can't be set to 0?)
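The first two points can be sketched in a few lines of Python (a minimal illustration, not the PR's actual code; the helper name follows the bullet above, and the exact stop-word semantics - truncating at the first semicolon - are an assumption):

```python
def postprocess_generation(generation: str, prompt: str) -> str:
    """Clean up a raw model generation for the return-completion task.

    Illustrative sketch: generate() echoes the prompt, so we strip that
    prefix, then truncate at the single ";" stop word (keeping it).
    """
    # Remove the echoed prompt prefix.
    if generation.startswith(prompt):
        generation = generation[len(prompt):]
    # Truncate at the first ";" stop word, semicolon included.
    stop = generation.find(";")
    if stop != -1:
        generation = generation[: stop + 1]
    return generation
```

For example, `postprocess_generation("return a + b; // more", "return ")` yields `"a + b;"`.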

concerns I hope to address:

  • my term paper on the project isn't published yet, so I have no reference to explain the tasks
  • my naming convention doesn't seem to fit
  • I haven't added any documentation yet
  • fix the dataset revision to 0.0.2 for this task specifically
  • I had to comment out every import fcntl, as that module doesn't exist on my home machine (Windows), so running the tests wasn't possible
  • It's really slow on my home machine (Intel GPUs aren't supported by accelerate on Windows), so I have only been able to run a few models on really short snippets. I got matching scores for gpt2, bigscience/bloom-560m, and Vipitis/santacoder-finetuned-Shadertoys-fine when running just 10 samples. Additionally, a single run with 300 samples (the setting used throughout the paper) also matched, at 0.566.
    Run parameters were the following:
accelerate launch main.py \
  --model Vipitis/santacoder-finetuned-Shadertoys-fine \
  --tasks ShaderEval \
  --limit 10 \
  --do_sample False \
  --save_generations \
  --save_generations_path generations_py.json \
  --use_auth_token \
  --trust_remote_code

(omitting the last two flags doesn't throw an error; the run still completes, only slower, but returns erroneous outputs)

@Vipitis Vipitis marked this pull request as draft July 20, 2023 00:28

Vipitis commented Jul 20, 2023

Converted to draft, as development of the next tasks has started on this branch. I will try to add the other tasks once they are ready. I don't plan to change task1, but I might improve its implementation.

@Vipitis Vipitis changed the title [WIP] adding ShaderEval task1 [WIP] adding ShaderEval tasks Oct 12, 2023
@Vipitis Vipitis closed this Oct 12, 2023
@Vipitis Vipitis deleted the ShaderEval_task1 branch October 12, 2023 18:55

Vipitis commented Oct 12, 2023

Closed while cleaning up a bunch of stuff.
I will open a new draft PR in a few weeks that will hopefully provide a better implementation as well as documentation.
The current plan is to have the data and the implementation finished by the end of this year.

@Vipitis Vipitis mentioned this pull request Dec 16, 2023