[BUG] NCL - class should be cleaned if number of samples is 0.5 * minority samples, not if 0.5 * data.shape[0] #764
Comments
I renamed the issue because, after reading the paper further, I realised my original interpretation was wrong: the implementation in imbalanced-learn reflects what is proposed in the paper, apart from the criterion used to exclude observations from the cleaning procedure.
@glemaitre @chkoar was this parameter set up as a Otherwise, I am happy to pick this up. Please let me know.
It corresponds to I will add some additional tests now but the algorithm looks fine to me.
Oh no, I see your point. Indeed, it should be the minority class.
Describe the bug
Neighbourhood cleaning rule procedure:
if ( x ∈ Ci in 3-nearest neighbors of misclassified y ∈ C )
and ( |Ci| ≥ 0.5 · |C| ) then A2 = { x } ∪ A2
The above is a copy of the pseudocode in the article. There, C is the minority class, or class of interest.
A further quote from the article:
"To avoid excessive reduction of small classes, only examples from classes larger or equal to 0.5 * | C | are considered while forming A2." Earlier, the article states that C is the minority class. The entire dataset is referred to as T.
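To make the difference between the two thresholds concrete, here is a minimal sketch in plain Python. It is not the imbalanced-learn implementation: `classes_to_clean` and its `reference` argument are hypothetical names introduced only for illustration, and the class counts are made up. It compares the article's rule (|Ci| ≥ 0.5 · |C|, with C the minority class) against a threshold based on the total number of samples, as described in the issue title.

```python
from collections import Counter


def classes_to_clean(y, threshold=0.5, reference="minority"):
    """Hypothetical helper: list the non-minority classes eligible for cleaning."""
    counts = Counter(y)
    # The minority class C is the class with the fewest samples.
    minority = min(counts, key=counts.get)
    if reference == "minority":
        # Rule quoted from the article: |Ci| >= threshold * |C|.
        cutoff = threshold * counts[minority]
    elif reference == "dataset":
        # Reading from the issue title: threshold * data.shape[0].
        cutoff = threshold * len(y)
    else:
        raise ValueError("reference must be 'minority' or 'dataset'")
    return sorted(c for c, n in counts.items() if c != minority and n >= cutoff)


# Made-up labels: class 0 is the minority (10 samples out of 100).
y = [0] * 10 + [1] * 40 + [2] * 50
print(classes_to_clean(y, reference="minority"))  # [1, 2]
print(classes_to_clean(y, reference="dataset"))   # [2]
```

With these counts, the dataset-based cutoff (50 samples) excludes class 1 from cleaning, while the minority-based cutoff (5 samples) keeps both majority classes eligible, which is the behaviour the article describes.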