Fashion200K dataset used in ICCV'17 paper "Automatic Spatially-aware Fashion Concept Discovery." [paper]
Author: Xintong Han
Contact: [email protected]
This dataset was crawled from Lyst.com in September 2016.
You can download the dataset via Google Drive:
image_urls.txt: Image URLs. Format: ImageName ImageURL.
detection: Detection results. Format: ImageName Category_DetectionScore_xMin_xMax_yMin_yMax. We trained a MultiBox detector for 9 classes (dress, skirt, top, bag, shorts, sunglasses, shoe, outwear, pants).
women.tar.gz: Cropped detected images. If you want the original images, you can download them using the URLs in image_urls.txt.
labels: train/test labels. Format: ImageName DetectionScore ProductDescription.
Note that image_urls.txt and the detection folder contain information for more than 300k images, but we removed around 100k of them (see the labels folder) because they have low detection scores.
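The line formats listed above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the dataset: it assumes fields are separated by whitespace, that the detection record is the last token on its line, and that the product description is everything after the score. The names `Detection`, `parse_detection_line`, and `parse_label_line` are hypothetical, and the sample lines in the usage note are made up for illustration.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    image_name: str
    category: str
    score: float
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def parse_detection_line(line: str) -> Detection:
    """Parse 'ImageName Category_DetectionScore_xMin_xMax_yMin_yMax'."""
    # Assumption: the detection record is the last whitespace-separated token.
    image_name, record = line.strip().rsplit(None, 1)
    # Split from the right so the five numeric fields are always peeled off,
    # whatever the category string looks like.
    category, score, x_min, x_max, y_min, y_max = record.rsplit("_", 5)
    return Detection(
        image_name, category,
        float(score), float(x_min), float(x_max), float(y_min), float(y_max),
    )


def parse_label_line(line: str) -> Tuple[str, float, str]:
    """Parse 'ImageName DetectionScore ProductDescription'."""
    # The description may contain spaces, so split at most twice.
    image_name, score, description = line.strip().split(None, 2)
    return image_name, float(score), description
```

For example, `parse_detection_line("img_001.jpeg dress_0.95_0.1_0.9_0.2_0.8")` yields a `Detection` whose `category` is `"dress"`, and `parse_label_line("img_001.jpeg 0.95 blue knee-length dress")` returns the description `"blue knee-length dress"`. If the actual files use a different separator, adjust the `split` calls accordingly.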
You can refer to TensorFlow's im2txt for how to train the model. Setting f_rnn_loss_factor and g_rnn_loss_factor to 0 in the model configuration in this repo also trains a visual-semantic embedding for the same purpose.
You can look at the original CAM code to figure out how to extract activation maps in a joint embedding setting.
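As a rough illustration of how the CAM idea might carry over to a joint embedding, here is a minimal NumPy sketch. It is not the code used in the paper. It assumes the image embedding is produced by global average pooling over convolutional feature maps followed by a single linear projection, and that a concept is scored by its dot product with that embedding. The function name and all argument names are hypothetical.

```python
import numpy as np


def embedding_activation_map(features, proj, concept):
    """Spatial map of how strongly each location supports a concept.

    features: (H, W, C) convolutional feature maps (assumed input).
    proj:     (C, D) linear projection applied after global average pooling.
    concept:  (D,) embedding vector of the concept of interest.
    """
    # Fold the projection and the concept into one weight per channel,
    # playing the role of the class weights in the original CAM.
    channel_weights = proj @ concept      # shape (C,)
    # Weighted sum of the feature maps over channels at every location.
    return features @ channel_weights     # shape (H, W)


# Usage with random placeholder data.
rng = np.random.default_rng(0)
features = rng.normal(size=(7, 7, 16))
proj = rng.normal(size=(16, 8))
concept = rng.normal(size=8)
cam = embedding_activation_map(features, proj, concept)
```

Because every step is linear, the mean of the map equals the dot product between the concept vector and the pooled, projected image embedding, which is what makes the map interpretable as a per-location contribution to the overall score.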
@inproceedings{han2017automatic,
title = {Automatic Spatially-aware Fashion Concept Discovery},
author = {Han, Xintong and Wu, Zuxuan and Huang, Phoenix X. and Zhang, Xiao and Zhu, Menglong and Li, Yuan and Zhao, Yang and Davis, Larry S.},
booktitle = {ICCV},
year = {2017},
}