Image Classification Dataset Cleaner

This repo provides a generic React-based framework for cleaning up noisy image classification datasets. It started as an internal cleaner for a dataset with fewer than 10 classes. We removed the hard-coded, dataset-specific parts to make it easier to adapt to other datasets.

It uses predictions from a trained model to surface the examples whose predictions disagree most with their labels. Such an example is either genuinely hard or simply mislabeled.
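One possible way to rank examples by how far off the prediction is, shown here as a minimal sketch and not necessarily what this repo does internally: score each validation example by the negative log-probability the model assigned to its labeled class, then sort from highest to lowest. The function name `rank_by_loss`, the in-memory input layout, and the sample values are assumptions made for illustration; the actual layout of valid_preds.csv is defined by the Imagenette example.

```python
import math


def rank_by_loss(probs, labels, eps=1e-12):
    """Return example indices ordered from most to least suspicious.

    probs:  one list of per-class probabilities per example (hypothetical layout)
    labels: one integer class index per example
    """
    # Negative log-probability of the labeled class; eps guards against log(0).
    losses = [-math.log(max(p[y], eps)) for p, y in zip(probs, labels)]
    # Highest loss first: these are the hard or mislabeled candidates.
    return sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)


# Made-up predictions for three examples and three classes.
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.02, 0.03, 0.95],
]
labels = [0, 0, 2]

ranking = rank_by_loss(probs, labels)
print(ranking)  # [1, 0, 2]
```

Example 1 comes first because the model gave its labeled class only 0.10 probability, which is exactly the kind of case worth reviewing by eye.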

(Note: this project is still at an early stage of development. The basic functionality works, but expect to do some work yourself if you need more advanced features.)

Usage

Required files (for more details, see the Imagenette example):

valid_preds.csv: model predictions for the validation set.
id_to_name_map.json: maps class indices to class names.
image files: the image files are served directly alongside the React site. Make sure their paths on the HTTP server match the paths in valid_preds.csv. (You can use symlinks to avoid copying data.)

Click the "Remove" button on any image that has a wrong label. Click "Export" when you're done or want to save your progress. Save the JSON file as valid-changes.json in the build/ folder (use the public/ folder in development mode). The exported file is what you then use to actually remove the noisy examples from the dataset.

If valid-changes.json exists, the cleaner loads it on startup to restore the previously exported annotations.
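Applying the export to the dataset could look roughly like the sketch below. The schema of valid-changes.json is not described here, so the sketch assumes the export can be reduced to a list of image paths marked for removal; the `drop_removed` helper, the `path` and `label` keys, and the sample paths are all hypothetical and should be adapted to the real file.

```python
import json


def drop_removed(rows, removed_paths):
    """Keep only the rows whose image path was not marked for removal."""
    removed = set(removed_paths)
    return [row for row in rows if row["path"] not in removed]


# Assumed export content: a JSON array of image paths marked "Remove".
# In practice this would be read from valid-changes.json instead.
exported = json.loads('["images/a.jpg", "images/c.jpg"]')

# Made-up dataset rows; the real ones would come from your own index file.
rows = [
    {"path": "images/a.jpg", "label": 0},
    {"path": "images/b.jpg", "label": 0},
    {"path": "images/c.jpg", "label": 1},
]

cleaned = drop_removed(rows, exported)
print([row["path"] for row in cleaned])  # ['images/b.jpg']
```

Filtering into a new list rather than deleting files keeps the step reversible, which is useful while the annotations are still being reviewed.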

Imagenette Example

First, run npm run build to create an optimized build in build/.

Download the example tar file and extract it into the build/ folder (e.g., tar zxf imagenette_example.tar.gz -C build).

Serve the optimized build (e.g., http-server --cors -p 8080 build/) and access it in the browser (e.g., http://localhost:8080).

veritable-tech/image-classification-dataset-cleaner
