Multi-Resolution CNNs for Large-Scale Scene Recognition

This repository provides the code and models for the following paper (arXiv preprint):

Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs
Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, and Yu Qiao 
in IEEE Transactions on Image Processing, 2017

Updates

February 21st, 2017
- Release the code and models
January 3rd, 2017
- Initialize the repo

Overview

We have made two efforts to exploit CNNs for large-scale scene recognition:

- We design a modular framework that captures multi-level visual information for scene understanding by training CNNs at different resolutions.
- We propose a knowledge disambiguation strategy that uses soft labels from extra networks to deal with the label ambiguity issue of scene recognition.

These two efforts are the core of the entries by team "SIAT_MMLAB" in the following large-scale scene recognition challenges.

Challenge	Rank	Performance
Places2 challenge 2015	2nd place	0.1736 top5-error
Places2 challenge 2016	4th place	0.1042 top5-error
LSUN challenge 2015	2nd place	0.9030 top1-accuracy
LSUN challenge 2016	1st place	0.9161 top1-accuracy
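
To illustrate the multi-resolution idea from the overview, the sketch below fuses the class scores of networks trained at different resolutions by averaging their softmax outputs. This is a minimal NumPy illustration, not the released pipeline: the function names `softmax` and `fuse_scores`, the optional `weights` argument, and the choice of a (weighted) arithmetic mean are assumptions made for this example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fuse_scores(logits_per_model, weights=None):
    """Average class probabilities from several models (hypothetical helper).

    logits_per_model: list of arrays, each of shape (num_images, num_classes),
                      e.g. one from a 256x256 model and one from a 384x384 model.
    weights:          optional per-model weights; uniform if omitted (assumption).
    """
    probs = np.stack([softmax(np.asarray(l, dtype=float)) for l in logits_per_model])
    if weights is None:
        weights = np.ones(len(probs))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the model axis: result has shape (num_images, num_classes).
    return np.tensordot(weights, probs, axes=1)

# Toy usage: two models disagree on one image with three classes.
logits_256 = np.array([[2.0, 0.0, 0.0]])
logits_384 = np.array([[0.0, 2.0, 0.0]])
fused = fuse_scores([logits_256, logits_384])
```

With uniform weights each model contributes equally, so in the toy case the first two classes end up with the same fused probability.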

Places365 Models

We first release the learned models on the Places365 dataset.

Models learned at resolution of 256 * 256

Model	Top5 Error Rate
(A0) Normal BN-Inception	0.143
(A1) Normal BN-Inception + object networks	0.141
(A2) Normal BN-Inception + scene networks	0.134

Models learned at resolution of 384 * 384

Model	Top5 Error Rate
(B0) Deeper BN-Inception	0.140
(B1) Deeper BN-Inception + object networks	0.136
(B2) Deeper BN-Inception + scene networks	0.130
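
The tables above report top-5 error rates. As a reference for how such a number is typically computed, here is a minimal NumPy sketch; the function name `top_k_error` and the toy data are illustrative assumptions and are not taken from the released MATLAB testing code.

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores.

    scores: array of shape (num_images, num_classes)
    labels: integer array of shape (num_images,)
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row (order inside the top k is irrelevant).
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hit = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()

# Toy usage: six classes, two images.
toy_scores = np.array([
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.0],   # true class 5 has the lowest score: a miss
    [0.9, 0.1, 0.2, 0.3, 0.4, 0.5],   # true class 0 has the highest score: a hit
])
toy_labels = np.array([5, 0])
toy_error = top_k_error(toy_scores, toy_labels, k=5)
```

In the toy case one of the two images is missed, so the top-5 error is 0.5.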

Download initialization and reference models

The download scripts are in the scripts/ directory.

Try bash scripts/get_init_models.sh to download knowledge models.

Try bash scripts/get_reference_models.sh to download reference models.

Testing Code

We release the testing code for the Places365 validation dataset in the matlab/ directory.

We also release demo code that uses our Places365 models as generic feature extractors to perform scene recognition on the MIT Indoor67 dataset, also in the matlab/ directory.

Training Code

We release the models in the models/ directory and the training scripts in the scripts/ directory.

Try bash scripts/256_inception2_train.sh to train standard CNNs.

Try bash scripts/256_kd_object_inception2_train.sh to train knowledge disambiguation networks (by object network).

Try bash scripts/256_kd_scene_inception2_train.sh to train knowledge disambiguation networks (by scene network).

The training code is based on our modified Caffe toolbox, an efficient parallel Caffe with an MPI implementation. We also implement a new KL-divergence loss layer for our knowledge disambiguation method:

https://github.com/yjxiong/caffe/tree/kd
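
To give a feel for what a KL-divergence loss on soft labels computes, the following NumPy sketch combines a standard cross-entropy term on the hard scene label with a KL term between a soft target from a knowledge network and an auxiliary prediction. It is an illustration only and does not reproduce the Caffe layer: the function names, the separate auxiliary output `aux_logits`, and the weight `lam` are assumptions made for this example.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def kl_divergence(q, log_f, eps=1e-12):
    """KL(q || f) = sum_k q_k * (log q_k - log f_k), one value per sample."""
    return (q * (np.log(q + eps) - log_f)).sum(axis=-1)

def kd_loss(scene_logits, labels, aux_logits, soft_targets, lam=1.0):
    """Hard-label cross-entropy plus lam * KL to the soft targets (hypothetical).

    scene_logits: (N, num_scene_classes) predictions for the scene label
    labels:       (N,) integer ground-truth scene labels
    aux_logits:   (N, K) auxiliary predictions compared with the soft targets
    soft_targets: (N, K) probabilities produced by a knowledge network
    """
    labels = np.asarray(labels)
    log_p = log_softmax(np.asarray(scene_logits, dtype=float))
    hard = -log_p[np.arange(len(labels)), labels]
    log_f = log_softmax(np.asarray(aux_logits, dtype=float))
    soft = kl_divergence(np.asarray(soft_targets, dtype=float), log_f)
    return (hard + lam * soft).mean()

# Toy usage: the auxiliary prediction matches the soft target exactly,
# so the KL term vanishes and only the cross-entropy term remains.
q = np.array([[0.7, 0.2, 0.1]])
loss_matched = kd_loss(np.array([[0.0, 0.0]]), [0], np.log(q), q, lam=1.0)
loss_mismatched = kd_loss(np.array([[0.0, 0.0]]), [0], np.zeros((1, 3)), q, lam=1.0)
```

The KL term is zero when the auxiliary prediction equals the soft target and positive otherwise, so `loss_mismatched` is larger than `loss_matched` in the toy case.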

Questions

Contact
