Added the required files to generate an instructional Python dataset… #3106

Merged 4 commits into LAION-AI:main on May 13, 2023

Conversation

@Nan-Do (Contributor) commented May 9, 2023

… and updated init.py. This solves #297

@olliestanley (Collaborator)

How good is the model at generating these summaries? Can you provide links to some examples?

@Nan-Do (Contributor, Author) commented May 9, 2023

@olliestanley in my opinion the quality of the Python model is pretty high: it generates a human-like response almost all of the time, though you could argue the summaries are a little short. Also, code-search-net is a fairly easy dataset, since all the functions include documentation, so the output is usually a shortened version of the docstring. I have also uploaded a version of code-search-net with the summaries included if you want to check the output quality. The dataset can be found here

@olliestanley (Collaborator) left a comment

Thanks! I have been through some of the data and left a note in the readme for anyone working with it, but overall the data looks pretty good ;)

@olliestanley enabled auto-merge (squash) May 13, 2023 08:29
@olliestanley linked an issue May 13, 2023 that may be closed by this pull request
@olliestanley merged commit 24856cd into LAION-AI:main May 13, 2023
@Nan-Do deleted the instructional-codesearchnet-python branch May 14, 2023 14:06

Successfully merging this pull request may close these issues.

Code Instructions using data augmentation
3 participants