Add multi-label classification dataset and metric #1572
- Add MultiLabel annotation support
- Refactor de/serialize annotation with AnnotationRaw
- Add ImageFolderDataset::with_items methods
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1572      +/-   ##
==========================================
- Coverage   86.53%   86.34%   -0.19%
==========================================
  Files         684      687       +3
  Lines       78248    78685     +437
==========================================
+ Hits        67713    67943     +230
- Misses      10535    10742     +207
    fn bin_config() -> bincode::config::Configuration {
        bincode::config::standard()
    }

    fn encode(&self) -> Vec<u8> {
        bincode::serde::encode_to_vec(self, Self::bin_config()).unwrap()
    }

    fn decode(annotation: &[u8]) -> Self {
        let (annotation, _): (AnnotationRaw, usize) =
            bincode::serde::decode_from_slice(annotation, Self::bin_config()).unwrap();
        annotation
    }
What exactly do we use the serialization for?
We decided on having annotations as bytes:
    struct ImageDatasetItemRaw {
        /// Image path.
        image_path: PathBuf,
        /// Image annotation.
        /// The annotation bytes can represent a string (category name) or path to annotation file.
        annotation: Vec<u8>,
    }
But now that you mention it... I don't see any need for serialization just to have bytes 😅 we could simply change the annotation type in ImageDatasetItemRaw to the AnnotationRaw enum. And scrap the encode/decode.
Probably needed another coffee when I went over this part ☕
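For illustration, the change proposed in this reply might look roughly like the sketch below: the item stores the typed enum directly, so no byte round trip is needed. The variants of AnnotationRaw, the constructor, and the label_indices helper are assumptions made for this example and are not taken from the PR diff.

```rust
use std::path::PathBuf;

/// Assumed variants; the PR's actual `AnnotationRaw` may differ.
#[derive(Debug, Clone, PartialEq)]
enum AnnotationRaw {
    /// A single category name.
    Label(String),
    /// Several category names for one image.
    MultiLabel(Vec<String>),
}

#[derive(Debug, Clone)]
struct ImageDatasetItemRaw {
    /// Image path.
    image_path: PathBuf,
    /// Image annotation, kept as a typed value instead of encoded bytes.
    annotation: AnnotationRaw,
}

impl ImageDatasetItemRaw {
    /// Hypothetical constructor, added only to keep the example short.
    fn new(image_path: impl Into<PathBuf>, annotation: AnnotationRaw) -> Self {
        Self {
            image_path: image_path.into(),
            annotation,
        }
    }
}

/// Hypothetical helper: map category names to class indices.
/// Returns `None` if any name is missing from `classes`.
fn label_indices(annotation: &AnnotationRaw, classes: &[&str]) -> Option<Vec<usize>> {
    let index_of = |name: &str| classes.iter().position(|c| *c == name);
    match annotation {
        AnnotationRaw::Label(name) => index_of(name.as_str()).map(|i| vec![i]),
        AnnotationRaw::MultiLabel(names) => names
            .iter()
            .map(|n| index_of(n.as_str()))
            .collect::<Option<Vec<usize>>>(),
    }
}
```

With the enum stored directly, consumers can match on the annotation without calling decode first, which is the simplification the reply is pointing at.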
Looks good. Just one change.
Pull Request Template
Checklist
- The run-checks all script has been executed.
Related Issues/PRs
Progress towards #1526.
Fine-tuning example is almost complete (locally).
Changes
Added ImageFolderDataset::new_multilabel_classification_with_items and HammingScore multi-label accuracy metric
- Annotation::MultiLabel(Vec<usize>) for multi-label classification
- AnnotationRaw enum to de/serialize different supported annotation types with bincode
- ImageFolderDataset new methods to use with_items
- HammingScore metric and MultiLabelClassificationOutput to handle multi-label outputs
Testing
New unit tests for dataset methods and hamming score metric.
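The description does not spell out how the Hamming score is computed. As a rough sketch, assuming the common definition (the fraction of per-class decisions that match the target after thresholding), it could be written on plain vectors as below. The function names, the threshold parameter, and the `>=` comparison are assumptions for this example; the PR's metric operates on tensors and may differ in detail.

```rust
/// Turn a list of active class indices (the shape carried by
/// `Annotation::MultiLabel(Vec<usize>)`) into a multi-hot vector.
/// Indices outside `0..num_classes` are ignored in this sketch.
fn multi_hot(labels: &[usize], num_classes: usize) -> Vec<bool> {
    let mut hot = vec![false; num_classes];
    for &label in labels {
        if label < num_classes {
            hot[label] = true;
        }
    }
    hot
}

/// Hamming score: share of (sample, class) positions where the
/// thresholded prediction agrees with the target.
/// Returns 0.0 for empty input.
fn hamming_score(probabilities: &[Vec<f32>], targets: &[Vec<bool>], threshold: f32) -> f32 {
    let mut matches = 0usize;
    let mut total = 0usize;
    for (probs, target) in probabilities.iter().zip(targets) {
        for (&p, &t) in probs.iter().zip(target) {
            // A class is predicted active when its probability reaches the threshold.
            if (p >= threshold) == t {
                matches += 1;
            }
            total += 1;
        }
    }
    if total == 0 {
        0.0
    } else {
        matches as f32 / total as f32
    }
}
```

Unlike exact-match accuracy, this gives partial credit: a sample with two of three classes predicted correctly contributes two matches rather than zero.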