Replies: 1 comment
-
If you want to use only part of the ImageFolder, you will probably need to create a new Dataset class. Instead, consider using loss weights, which is the strategy I have usually seen. You use all of your data, but multiply the loss values for your small class by some constant (between 1 and 80) to make them more important. This should counterbalance the difference in data quantity and avoid skewed predictions.
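As a rough sketch of the loss-weight idea, the per-class constants could be derived from the class counts. The helper name `class_weights` and the "largest count divided by class count" rule are assumptions made for illustration, not something prescribed above; the toy `targets` list stands in for the list of class indices that ImageFolder exposes as `targets`, and class indices are assumed to run 0..n-1 with every class present.

```python
from collections import Counter


def class_weights(targets):
    """Return one loss weight per class index (hypothetical helper).

    Each weight is (size of the largest class) / (size of this class),
    so the largest class gets 1.0 and rarer classes get larger values.
    """
    counts = Counter(targets)
    largest = max(counts.values())
    # Sort by class index so weights[i] belongs to class i.
    return [largest / counts[c] for c in sorted(counts)]


# Toy labels with the same 80:1 imbalance as in the question.
targets = [0] * 80 + [1]
weights = class_weights(targets)
print(weights)  # [1.0, 80.0]
```

With PyTorch, such a list can be passed to the loss, for example `torch.nn.CrossEntropyLoss(weight=torch.tensor(weights))`. The full 80x ratio is only the upper end of the range mentioned above; a smaller constant may work better in practice.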
-
Question
Hi, I've just prepared my image dataset for training, but realized it has one particular issue: one class has about 80k examples, while the other has only 1k. Is it possible to take only 1k examples of the bigger class at random in an ImageFolder dataset? Or to load everything, but then train on only a fixed number of examples from each class?
It's like splitting a dataset in Python with the class ratio fixed, but at the moment of loading the dataset, since I already have it split into training and testing sets.
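For reference, the "fixed number of examples from each class" variant asked about here might be sketched as an index selection step. The function name `balanced_indices`, its parameters, and the fixed seed are hypothetical; the toy `targets` list again stands in for the `targets` attribute of an ImageFolder dataset.

```python
import random
from collections import defaultdict


def balanced_indices(targets, per_class, seed=0):
    """Pick up to `per_class` random sample indices for every class (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(targets):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        idxs = by_class[label]
        # A class with fewer than `per_class` samples is kept whole.
        k = min(per_class, len(idxs))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)


# Toy labels: 80 samples of class 0 and 5 samples of class 1.
targets = [0] * 80 + [1] * 5
indices = balanced_indices(targets, per_class=5)
print(len(indices))  # 10
```

The resulting list can then be wrapped with `torch.utils.data.Subset(dataset, indices)` and handed to a DataLoader, which avoids writing a new Dataset class for this case.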