Only evaluate spambug on bugs filed by people without "editbugs" permissions, then check if it's better to train on all or just non-editbugs #2787
Comments
Trying to wrap my head around this issue, and correct me if I'm wrong:
The goal of this issue is twofold:
@jpangas unfortunately this issue might be problematic for you to fix, as you'd need special permissions to see which users have editbugs permissions.
A workaround could be checking whether the user's email belongs to a Mozilla employee (e.g., ends with …). This will not catch all cases, but it could perform better for the training dataset (item 2), since it will also catch bugs filed by users who had editbugs permissions at the time but no longer do. For item 1, relying on the editbugs permissions will give more realistic results. @marco-c wdyt?
In 2), did you mean comparing training on all bugs against training on bugs filed by people with editbugs permissions, or did you actually mean training only on bugs filed by people without editbugs permissions (which is what we do currently)? bugbug/bugbug/models/spambug.py Lines 87 to 89 in f990605
Currently we train only on bugs filed by non-Mozillians (I assume these people do not have editbugs permissions). Including bugs filed by Mozillians would be one way to test how performance changes (in line with what @suhaibmujahid has suggested).
Yes, sorry, I meant training on bugs filed by people without editbugs permissions. For evaluation, we should always skip bugs filed by people with editbugs permissions, as we do in production, because we want to measure exactly what happens in production.
You can retrieve the list of users with editbugs by doing In the model, we could do something like (pseudocode):
P.S.: as part of this, we should also skip people with "@Softvision" in their email address.
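A minimal sketch of the filtering discussed above, not the actual bugbug implementation. It assumes each bug is a dict whose `creator` field holds the reporter's email (as in Bugzilla's REST output) and that the set of editbugs users has already been retrieved by some other means. The helper names `is_trusted_reporter` and `bugs_for_evaluation` are hypothetical.

```python
from typing import Iterable, Iterator


def is_trusted_reporter(email: str, editbugs_emails: set[str]) -> bool:
    """Return True if bugs from this reporter should be skipped.

    Assumption: `editbugs_emails` contains lowercase emails of users
    with editbugs permissions, obtained elsewhere.
    """
    email = email.lower()
    # Skip editbugs users and, per the P.S., anyone with "@softvision"
    # in their email address.
    return email in editbugs_emails or "@softvision" in email


def bugs_for_evaluation(
    bugs: Iterable[dict], editbugs_emails: Iterable[str]
) -> Iterator[dict]:
    """Yield only bugs filed by reporters the spam model would see."""
    normalized = {e.lower() for e in editbugs_emails}
    for bug in bugs:
        if not is_trusted_reporter(bug["creator"], normalized):
            yield bug
```

Matching on a lowercase copy of the email keeps the check insensitive to capitalization differences such as "@Softvision" versus "@softvision".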
Great. Thanks, I'm on it and I will open a PR once everything is ready.
@jpangas we already have a feature to check if the user is a mozillian, but it is not used in the spambug model: Lines 177 to 184 in f990605
Thanks @suhaibmujahid
Depends on #4407.
The spambug model is only applied to bugs filed by people without "editbugs" permissions, so it makes sense to only evaluate it on these kinds of bugs and not all bugs.
For training, we can keep using all bugs, but we should check whether performance improves or worsens if we only use bugs filed by non-editbugs people.
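One way the comparison could be set up, as a rough sketch rather than how bugbug does it: hold out a single test set that contains only non-editbugs bugs, then build two training sets from the remaining bugs (all bugs versus non-editbugs only) so both variants are scored on the same data. The function name `split_for_experiment`, the `creator` field, and the default split fraction are assumptions made for illustration.

```python
import random


def split_for_experiment(bugs, editbugs_emails, test_fraction=0.2, seed=42):
    """Build two candidate training sets and one shared test set.

    Assumption: each bug is a dict with a `creator` email, and
    `editbugs_emails` lists users with editbugs permissions.
    """
    trusted = {e.lower() for e in editbugs_emails}

    # Shuffle deterministically so the experiment is repeatable.
    shuffled = list(bugs)
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    held_out, train_all = shuffled[:n_test], shuffled[n_test:]

    def is_non_editbugs(bug):
        return bug["creator"].lower() not in trusted

    return {
        # Variant A: train on every bug in the training split.
        "train_all": train_all,
        # Variant B: train only on bugs filed by non-editbugs users.
        "train_non_editbugs": [b for b in train_all if is_non_editbugs(b)],
        # Both variants are evaluated on non-editbugs bugs only,
        # mirroring what the model sees in production.
        "test": [b for b in held_out if is_non_editbugs(b)],
    }
```

Training the same model on `train_all` and on `train_non_editbugs`, then scoring both on `test`, would show whether restricting the training data helps or hurts.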