Metadata needed for level of semantic validation for QA pairs #15

Open
jh111 opened this issue Jan 31, 2024 · 2 comments

jh111 commented Jan 31, 2024

Kara is helping scale the number of SME-free test assets.

What we need is one (or more) columns in testingAssetsfor50Github_2024-01-02_20240102evaluation.xlsx to represent the level of validation, plus guidelines for the values that go in that column.

Currently, most QA pairs have been validated by SME or SMuRF. New QA pairs may have no validation yet (but are still useful for delta), or "came from Chat GPT".

On a related note, we have to decide whether this xlsx is going to remain the one place for all QA pairs, or whether we want multiple spreadsheets and/or a database.


karafecho commented Jan 31, 2024

Here is a link to the G-sheet. (I think the one above may be broken.)

I added Column J, "Method of Generation", and included a dropdown menu with selections of: Manual, SMuRF; Manual, SME; Automated, LLM; Other (please specify). I also added Column K, "Level of Validation", and included a dropdown menu with selections of: SMuRF; SME; No validation; Other/multiple (please specify). There may be a better way of capturing the desired information, so please feel free to delete the columns and suggest another approach.

Question for @jh111: I was viewing ChatGPT as a method for (quickly) generating assets, but not necessarily for validating them. I think that's what you mean, right? If not, then the approach I suggested will not work.
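
For anyone who later wants to sanity-check the two columns outside the sheet, here is a minimal sketch. It assumes the rows have already been exported to plain strings; the allowed values are copied from the dropdowns described above, while the names `check_row` and `_is_allowed` are hypothetical, and treating any value that starts with "Other" as an acceptable free-text entry is an assumption.

```python
# Allowed dropdown values, copied from the Column J / Column K description.
METHOD_OF_GENERATION = {"Manual, SMuRF", "Manual, SME", "Automated, LLM"}
LEVEL_OF_VALIDATION = {"SMuRF", "SME", "No validation"}


def _is_allowed(value, allowed, other_prefix):
    """True if value is a known dropdown entry or an 'Other ...' free-text entry."""
    value = value.strip()
    # Assumption: "Other (please specify)" entries keep the "Other" prefix.
    return value in allowed or value.startswith(other_prefix)


def check_row(method, level):
    """Return a list of problems found in one QA-pair row (empty if none)."""
    problems = []
    if not _is_allowed(method, METHOD_OF_GENERATION, "Other"):
        problems.append(f"unknown Method of Generation: {method!r}")
    if not _is_allowed(level, LEVEL_OF_VALIDATION, "Other/multiple"):
        problems.append(f"unknown Level of Validation: {level!r}")
    return problems


if __name__ == "__main__":
    # A ChatGPT-generated pair with no validation passes both checks.
    print(check_row("Automated, LLM", "No validation"))  # → []
    # A method that is not in the dropdown is reported.
    print(check_row("ChatGPT", "SME"))
```

Keeping generation and validation as separate checks mirrors the two-column split: an LLM-generated pair can be valid in Column J while still carrying "No validation" in Column K.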


jh111 commented Jan 31, 2024

Your two-column approach works well, and I agree: ChatGPT-generated, No validation.

I suppose in the future we could have other types of validation. We could have "ChatGPT vX confirmed", but I'm not sure how heavily I'd weigh that validation...
