
Support N-Level judgements (instead of 3) #109

Closed
anthonygroves opened this issue Feb 12, 2020 · 10 comments

@anthonygroves

Currently, RRE only supports a three-level judgement scale. But docs here mention that this will likely be generalized in future versions: https://github.com/SeaseLtd/rated-ranking-evaluator/wiki/What%20We%20Need%20To%20Provide

My personal need is to use RRE with a judgement scale of 4 or 5, but I can see how others may want it to support any level up to 10. It would be nice if the RRE user could easily configure which N-level judgement scale to use, up to a reasonable (10?) value.

@agazzarini agazzarini self-assigned this Feb 12, 2020
@agazzarini agazzarini added the enhancement New feature or request label Feb 12, 2020
@agazzarini
Member

Thanks for raising this @anthonygroves. I completely agree with your need and with generalising the judgement scale.

@epugh
Contributor

epugh commented Mar 10, 2020

I wanted to mention that I'm looking for a 4-point scale rather than three as well. I think I can contribute some development effort if you can give me some pointers?

@agazzarini
Member

Hi @epugh, judgments are gathered in

io.sease.rre.Func::gainOrRatingNode

and looking at its usages (screenshot omitted), the callers are essentially the metrics and the Query class.
The only doubts I have are:

  • is it better to configure the scale somewhere (e.g. in the pom.xml), or to do a preliminary scan of the judgments to determine the scale automatically? Or both (e.g. fall back to the automatic detection when no configuration is found)?
  • the current code assigns 2 (the average) when the rating is not present. Does the average always make sense?
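The fallback idea in the first bullet (use an explicit configuration if present, otherwise scan the judgments) could be sketched roughly like this. The names `GradeScaleResolver`, `resolveMaxGrade` and `configuredMaxGrade` are hypothetical, not part of the RRE codebase:

```java
import java.util.List;
import java.util.Optional;

public class GradeScaleResolver {

    /**
     * Hypothetical sketch: prefer an explicitly configured maximum grade
     * (e.g. read from the pom.xml), otherwise fall back to the highest
     * grade observed in the judgment set; 3 mirrors the current fixed scale.
     */
    static int resolveMaxGrade(Optional<Integer> configuredMaxGrade, List<Integer> judgments) {
        return configuredMaxGrade.orElseGet(() ->
                judgments.stream().mapToInt(Integer::intValue).max().orElse(3));
    }

    public static void main(String[] args) {
        System.out.println(resolveMaxGrade(Optional.of(5), List.of(1, 2, 3)));   // explicit config wins: 5
        System.out.println(resolveMaxGrade(Optional.empty(), List.of(1, 4, 2))); // inferred from data: 4
    }
}
```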

@mattflax
Contributor

Hi @agazzarini,

@epugh asked me to take a look at this. My thoughts would be:

  • set constant default values of maxgrade=3, default=2 where the judgement is missing (matching current behaviour);
  • add the max grade as an optional construction parameter for NDCG, RR (and make it optional for ERR);
  • add the default grade (average) as an optional construction parameter for ERR, NDCG, RR (ERR currently calculates it as maxgrade/2, rounded to the nearest int).

This allows overriding all defaults using the ParameterizedMetric mechanism, and works for all the metrics. It doesn't work for the Query class though.

A further step, and this might be better as a separate issue/PR, might be to:

  • allow setting the max and default grades in the top-level pom.xml configuration;
  • allow overrides via the ParameterizedMetric mechanism, so users can set both "pessimistic" and "optimistic" defaults;
  • handle metric naming when both pessimistic and optimistic defaults are set for the same metric (currently I think you'd get two ERR@10 metrics, though I haven't checked).

How does this sound? I'm happy to put together PRs for these if they seem like sensible suggestions.
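As a rough illustration of the defaulting scheme in the first list above (not actual RRE code; the class and field names here are made up), a metric base class could carry both values as overridable constructor parameters:

```java
import java.math.BigDecimal;

// Illustrative only: a metric base carrying the maximum grade and the
// grade used where a judgement is missing, defaulting to the current
// fixed behaviour (maxgrade = 3, default/fair grade = 2).
abstract class GradedMetric {
    static final BigDecimal DEFAULT_MAX_GRADE = new BigDecimal(3);
    static final BigDecimal DEFAULT_MISSING_GRADE = new BigDecimal(2);

    final BigDecimal maxGrade;
    final BigDecimal missingGrade;

    GradedMetric() {
        this(DEFAULT_MAX_GRADE, DEFAULT_MISSING_GRADE);
    }

    GradedMetric(BigDecimal maxGrade, BigDecimal missingGrade) {
        this.maxGrade = maxGrade;
        this.missingGrade = missingGrade;
    }

    /** Grade of a result, falling back to the missing-judgement default. */
    BigDecimal gradeOf(BigDecimal judgment) {
        return judgment != null ? judgment : missingGrade;
    }
}

// A metric on a 4-point scale would then pass the overrides explicitly,
// e.g. via the ParameterizedMetric mechanism mentioned above.
class FourPointMetric extends GradedMetric {
    FourPointMetric() {
        super(new BigDecimal(4), new BigDecimal(2));
    }
}
```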

@renekrie

(I thought I had left a comment earlier but it seems lost - sorry for re-posting if my old comment re-appears.)

I was wondering if we could use this rework of grades to allow for decimal numbers, and grades between 0 and 1, as well. This is often what we end up with when we derive judgments/grades from tracking. Mapping them to integers/buckets would add a second, somewhat arbitrary model.

@mattflax
Contributor

This should be possible - most of the calculations already use BigDecimal (arbitrary-precision decimal) objects, although both the DCG and ReciprocalRank calculations convert the grade to an integer before using it. Would it be useful to specify the scale (number of decimal places) as well? I can see both 2 and 8 being used, depending on the metric, though I don't know why those choices were made.
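To illustrate the point about the integer conversion, here is a sketch assuming the common exponential gain 2^grade - 1 (not necessarily RRE's exact formula): keeping the grade as a decimal preserves the distinction between, say, a grade of 0.5 and a grade of 0, which truncation would erase.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class FractionalGain {

    /**
     * Exponential gain 2^grade - 1, computed without truncating the grade
     * to an integer, so fractional grades contribute fractional gains.
     * Precision of 8 echoes one of the scale values mentioned above.
     */
    static BigDecimal gain(BigDecimal grade) {
        double g = Math.pow(2, grade.doubleValue()) - 1;
        return new BigDecimal(g, new MathContext(8));
    }

    public static void main(String[] args) {
        System.out.println(gain(new BigDecimal("0.5"))); // ~0.414, would collapse to 0 if truncated first
        System.out.println(gain(new BigDecimal("3")));   // 7
    }
}
```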

mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 16, 2020
- Make fairgrade configurable at construction time;
- Use floating point grade values in gain();
- Add default grade values to Metric.
mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 16, 2020
- Make maxgrade and fairgrade configurable at construction time;
- Use floating point grade values in gain function.
mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 16, 2020
…de to be set via Maven pom.xml:

- Add singleton factory for instantiating MetricClassManager instances;
- Allow access to default grade values via factory class.
- Modify Maven plugins to use factory.
@renekrie

> This should be possible - most of the calculations use BigDecimal (floating point) objects already, although both the DCG and ReciprocalRank calculations convert the grade to an integer before using it. Would it be useful to specify the scale value as well? I can see both 2 and 8 being used, depending on the metric, though I don't know why those choices were made.

I think scale wouldn't hurt as an optional parameter. It's probably a difficult choice to find the right value - I guess hardly anyone has an opinion about what would be right - but at least you provide the flexibility.

Yet another thought on floats: depending on the implementation of the metrics, we should be careful with judgment values between 0 and 1. If we took the logarithm of such a value we would end up with a negative number, but I guess none of the metrics does that?!
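A quick sanity check on that concern, assuming the standard DCG formulation (the sum over ranks of (2^grade - 1) / log2(rank + 1)): the logarithm is applied to the rank, which is always at least 1, never to the grade, so grades in (0, 1) simply yield gains in (0, 1) and each term stays non-negative.

```java
public class DcgTermCheck {

    /** One DCG term: (2^grade - 1) / log2(rank + 1), with rank starting at 1. */
    static double dcgTerm(double grade, int rank) {
        return (Math.pow(2, grade) - 1) / (Math.log(rank + 1) / Math.log(2));
    }

    public static void main(String[] args) {
        System.out.println(dcgTerm(0.5, 1)); // ~0.414: fractional, but non-negative
        System.out.println(dcgTerm(0.0, 3)); // 0.0: a zero grade contributes nothing
    }
}
```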

mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 17, 2020
@mattflax
Contributor

@renekrie I've skipped making scale configurable for now - the PR was getting big, and it feels like something that could be put in a separate issue if it becomes a requirement.

mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 17, 2020
…RR, NDCG, RR to add consistency with pom.xml parameters.
mattflax pushed a commit to mattflax/rated-ranking-evaluator that referenced this issue Mar 22, 2020
agazzarini added a commit that referenced this issue Mar 25, 2020
#109: Make maximum and default grades configurable.
@epugh
Contributor

epugh commented Mar 25, 2020

Is this closable @agazzarini with the merge?

@agazzarini
Member

agazzarini commented Mar 25, 2020 via email

@agazzarini agazzarini added this to the 1.1 milestone Apr 23, 2020