Allow RatingQuestion to take value 0 #4822

Closed
5 tasks done
MoritzLaurer opened this issue May 13, 2024 · 3 comments · Fixed by #4864

@MoritzLaurer
Contributor

MoritzLaurer commented May 13, 2024

small feedback on rg.RatingQuestion: at the moment, RatingQuestions are not allowed to take the value 0. There has been some recent literature on LLM-as-a-judge where people use cumulative prompts/instructions and you award points to a response for each fulfilled criterion. These prompt types can also result in 0 points if no criterion is fulfilled. At the moment, it's not possible to use this type of prompt with Argilla's RatingQuestions, because the minimum value always has to be at least 1. Is there added value to enforcing a minimum value of 1? Being able to set values=[0, 1, 2, 3, 4, 5] for these types of instructions would be useful.

import argilla as rg

dataset_argilla = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="content", use_markdown=True, required=True),
    ],
    questions=[
        rg.RatingQuestion(
            name="rating_criterion_1",
            description="Some quality rating on a scale from 0 to 5.",
            required=True,
            values=[0, 1, 2, 3, 4, 5]
        )
    ]
)

ValidationError: 1 validation error for RatingQuestion
values -> 0
  ensure this value is greater than or equal to 1 (type=value_error.number.not_ge; limit_value=1)
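
To illustrate the kind of range check behind this error, here is a minimal standalone sketch. It is not Argilla's actual implementation: the function name `validate_rating_values`, its `minimum` and `maximum` parameters, and the upper bound of 10 are all assumptions made for the example. With a lower bound of 1 the list from the snippet above is rejected; with a lower bound of 0 the same list passes.

```python
from typing import List


def validate_rating_values(values: List[int], minimum: int = 1, maximum: int = 10) -> List[int]:
    """Hypothetical stand-in for the range check on rating values."""
    for value in values:
        if not minimum <= value <= maximum:
            raise ValueError(
                f"value {value} is outside the allowed range [{minimum}, {maximum}]"
            )
    if len(set(values)) != len(values):
        raise ValueError("rating values must be unique")
    return values


requested = [0, 1, 2, 3, 4, 5]

# With a lower bound of 1, the value 0 is rejected.
try:
    validate_rating_values(requested, minimum=1)
    rejected = False
except ValueError:
    rejected = True

# Lowering the (assumed) bound to 0 accepts the same list unchanged.
accepted = validate_rating_values(requested, minimum=0)
```
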
nataliaElv added this to the v1.29.0 milestone May 13, 2024
@burtenshaw
Contributor

Thanks for the proposal, @MoritzLaurer.

Out of interest, could 0 be communicated by a None answer in a required=False rating question? The advantage would be that it saves your annotator time.

@MoritzLaurer
Contributor Author

MoritzLaurer commented May 14, 2024

mh.. I think explicitly selecting "0 points" is important for these prompts. Here is an example:

eval_prompt_cumulative = f"""\
Your task is to evaluate the quality of the image \
based on the color scheme and how well the colors harmonize with the content of the image.

First reason step by step for your evaluation and then return a quality score.

Instructions for scoring on a cumulative scale from 0 to 2:
- Award zero points if the colors neither harmonize with each other nor with the content of the image.
- Add one point if the colors harmonize well with each other.
- Add one point if the colors harmonize well with the content of the image.

Use the following JSON schema:\n{schema_simplified}
"""

0 points is an explicit option, so I think it would be best for annotators to select it explicitly.
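
The cumulative scoring described in the prompt above can be sketched in a few lines. This is only an illustration: the function `cumulative_score` and the criterion names are hypothetical and appear in neither the prompt nor Argilla. It shows that 0 is a regular outcome of the scale, reached whenever no criterion is fulfilled.

```python
from typing import Dict


def cumulative_score(criteria_met: Dict[str, bool]) -> int:
    """Return one point per fulfilled criterion; zero if none is fulfilled."""
    return sum(1 for met in criteria_met.values() if met)


# Hypothetical criterion names mirroring the two bullet points in the prompt.
no_harmony = cumulative_score(
    {"harmonize_with_each_other": False, "harmonize_with_content": False}
)
partial_harmony = cumulative_score(
    {"harmonize_with_each_other": True, "harmonize_with_content": False}
)
full_harmony = cumulative_score(
    {"harmonize_with_each_other": True, "harmonize_with_content": True}
)

# The full set of scores this two-criterion scale can produce, including 0.
possible_values = list(range(0, 3))
```

A rating question meant to capture these scores would therefore need `values=[0, 1, 2]`.
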

@burtenshaw
Contributor

Thanks for the clarification, @MoritzLaurer.

I agree that it makes sense for the annotator to specify 0 and that, because the scale is cumulative, it should be a `RatingQuestion` and not a `LabelQuestion` with 0 as a label.

@nataliaElv We should clarify in our docs the difference between a 0 and null response. We could also give examples like this one compared to a 1-5 rating.
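
As a rough illustration of why a 0 and a null response are not interchangeable, consider aggregating ratings for one record. The helper `mean_rating` below is hypothetical and not part of Argilla; it treats `None` as "no answer" and skips it, while an explicit 0 counts as a submitted score.

```python
from statistics import mean
from typing import List, Optional


def mean_rating(responses: List[Optional[int]]) -> Optional[float]:
    """Average the submitted ratings; None means 'no answer' and is skipped."""
    answered = [r for r in responses if r is not None]
    return float(mean(answered)) if answered else None


# An explicit 0 pulls the average down ...
with_zero = mean_rating([0, 2, 1])

# ... whereas a missing answer is simply left out of the average.
with_null = mean_rating([None, 2, 1])

# With no submitted answers there is nothing to average.
all_null = mean_rating([None, None])
```

Under these assumptions the two cases yield different aggregates, so encoding "zero points" as a skipped answer would change downstream statistics.
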

jfcalvo self-assigned this May 20, 2024
jfcalvo added a commit that referenced this issue May 23, 2024
# Description

This PR is the feature branch with all the changes to support zero value
on rating questions.

Closes #4822 

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [x] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [x] Adding and changing tests.

**Checklist**

- [ ] I added relevant documentation
- [ ] follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
