
Feature extraction, feature binarization and image retrieval examples #161

Merged: 24 commits merged into BVLC:dev from kloudkl:simplify_feature_extraction on Mar 20, 2014

Conversation

@kloudkl (Contributor) commented Feb 25, 2014

This pull request reopens #141 to target the dev branch. It serves two purposes.

First, extracting data from a specific blob or layer by numeric index is very inconvenient. To meet the strong demand for easy feature extraction, Has/Get Blob/Layer methods are added to simplify it. CAFFE stands for Convolution Architecture For Feature Extraction, so let's have a feature extraction example.

Second, the natural next step is to apply the extracted features in practical applications, e.g. image retrieval. Image retrieval is fastest with binary features, but putting every step of a complete pipeline into one example would be too complex, so a separate feature binarization example is split out.

The pipeline of feature extraction, binarization and similarity search has been tested to work well on the MNIST prototxt and leveldb.
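
To make the binarization and search steps concrete, here is a minimal C++ sketch of one way to binarize features and compare them by Hamming distance (the names binarize and hamming_distance are illustrative, not the PR's actual code):

#include <cstddef>
#include <cstdint>
#include <vector>

// Pack a real-valued feature into 64-bit words, setting bit i when the
// i-th value exceeds the feature's mean (a simple thresholding scheme).
std::vector<uint64_t> binarize(const std::vector<float>& feature) {
  float mean = 0;
  for (size_t i = 0; i < feature.size(); ++i) mean += feature[i];
  mean /= feature.size();
  std::vector<uint64_t> bits((feature.size() + 63) / 64, 0);
  for (size_t i = 0; i < feature.size(); ++i) {
    if (feature[i] > mean) bits[i / 64] |= uint64_t(1) << (i % 64);
  }
  return bits;
}

// Hamming distance between two equal-length codes: XOR, then popcount.
// __builtin_popcountll is a GCC/Clang builtin.
int hamming_distance(const std::vector<uint64_t>& a,
                     const std::vector<uint64_t>& b) {
  int d = 0;
  for (size_t i = 0; i < a.size(); ++i) d += __builtin_popcountll(a[i] ^ b[i]);
  return d;
}

Retrieval then reduces to ranking database codes by Hamming distance to the query code, which is much cheaper than Euclidean distance on the raw floats.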

Related issues:
#20: Extract the middle features
#112: pythonic export of features and params for wrapper
#139: About dump_network.cpp

@longjon (Contributor) commented Feb 26, 2014

Why GetBlob but GetLayerByName? Also, should these not be lowercase per the Google C++ style guide? (which are we not following?)

@kloudkl (Contributor, Author) commented Feb 26, 2014

There is a GetLayer in src/caffe/layer_factory.cpp, and I do not want users to confuse it with the added method.

We are following the Google C++ style guide, in particular the consistency and uniformity requirements excerpted below from its Background section:

"One way in which we keep the code base manageable is by enforcing consistency. It is very important that any programmer be able to look at another's code and quickly understand it. Maintaining a uniform style and following conventions means that we can more easily use "pattern-matching" to infer what various symbols are and what invariants are true about them. Creating common, required idioms and patterns makes code much easier to understand. In some cases there might be good arguments for changing certain style rules, but we nonetheless keep things as they are in order to preserve consistency."

@shelhamer (Member)

Please rebase to fit the new dir arrangement so we can move this along. This'll be a nice example to include.

@longjon (Contributor) commented Feb 26, 2014

Indeed, consistency comes first. My suggestion is to use lowercase names for the getter-like methods, as is already done for existing getters like blob_names and layer_names, following http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Names. (Note that the GetLayer function you refer to is not a getter.) My second suggestion is to include "by name" in either both names or neither. So I would be happy with blob_by_name and layer_by_name (and perhaps also has_blob and has_layer), for example.

@kloudkl (Contributor, Author) commented Feb 26, 2014

@longjon, your suggestion on neat and consistent naming is adopted in dd97ce0. Thanks!
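
For reference, the adopted accessors on the Net class look roughly like this (a declaration sketch, not the literal code; see dd97ce0 for the exact signatures):

template <typename Dtype>
class Net {
 public:
  // Lowercase getter-style names, per the Google C++ style guide.
  bool has_blob(const std::string& blob_name) const;
  const shared_ptr<Blob<Dtype> > blob_by_name(const std::string& blob_name) const;
  bool has_layer(const std::string& layer_name) const;
  const shared_ptr<Layer<Dtype> > layer_by_name(const std::string& layer_name) const;
  // ... existing Net members ...
};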

@kloudkl (Contributor, Author) commented Mar 4, 2014

Rebased for code review.

@palmforest

@kloudkl

In demo_extract_features.cpp, every time I run the code, the process stops at line 123, "data_layer.Forward(bottom_vec_that_data_layer_does_not_need_, &top_vec);". Could you give an example command to run this demo?

Also in demo_extract_features.cpp, line 103 reads "if (num_layer = data_net_param.layers_size())"; should it be "if (num_layer == data_net_param.layers_size())"?

Thanks

@kloudkl (Contributor, Author) commented Mar 11, 2014

@palmforest, thanks for testing the code!

You must be using an older version of the PR. Please pull again to update.

demo_extract_features.cpp has been renamed to extract_features.cpp and moved to the tools directory. I gave up on using an independent data layer to pass data into the feature extraction network, so data_net_param no longer exists. The most recent version expects the network to contain a data layer. I opened a new PR, #196, to pass data into the network from memory.
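
In outline, the tool does roughly the following (a simplified sketch; argument parsing, Datum serialization and error handling are omitted, variable names are illustrative, and the exact Caffe calls may differ between versions):

// Load the feature extraction network; the prototxt must contain a data layer.
Net<float> net(net_proto_path);
net.CopyTrainedLayersFrom(pretrained_model_path);

// Open the output leveldb for the extracted features.
leveldb::DB* db;
leveldb::Options options;
options.create_if_missing = true;
leveldb::DB::Open(options, output_leveldb_path, &db);

for (int batch = 0; batch < num_batches; ++batch) {
  net.ForwardPrefilled();  // the data layer supplies the input batch
  const shared_ptr<Blob<float> > blob = net.blob_by_name(feature_blob_name);
  // Serialize blob->cpu_data() into Datum messages and db->Put() them.
}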

@palmforest

Hi @kloudkl,

I downloaded the simple-feature-extraction version to try tools/extract_features.cpp. However, when I use the trained ImageNet model and imagenet.prototxt to extract features, I get the following error: "Unknown layer name: padding". Could you help me see what the problem is?

[palm@GPU01]$ GLOG_logtostderr=1 build/tools/extract_features.bin /home/palm/caffe_imagenet_model /home/tongping/imagenet.prototxt fc8 imagenet_features_val_leveldb_fc8 50
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0312 18:03:57.049176 16385 extract_features.cpp:54] Using CPU
I0312 18:03:59.833039 16385 net.cpp:70] Creating Layer data
I0312 18:03:59.833089 16385 net.cpp:105] data -> data
I0312 18:03:59.833107 16385 net.cpp:105] data -> label
I0312 18:03:59.833161 16385 data_layer.cpp:136] Opening leveldb /home/tongping/taobao-val-leveldb-v1
I0312 18:03:59.838238 16385 data_layer.cpp:174] output data size: 256,3,227,227
I0312 18:03:59.838258 16385 data_layer.cpp:191] Loading mean file from /home/tongping/taobao.mean.binaryproto
I0312 18:03:59.923591 16385 net.cpp:120] Top shape: 3 227 227
I0312 18:03:59.923609 16385 net.cpp:120] Top shape: 1 1 1
I0312 18:03:59.923616 16385 net.cpp:146] data does not need backward computation.
I0312 18:03:59.923630 16385 net.cpp:70] Creating Layer conv1
I0312 18:03:59.923637 16385 net.cpp:80] conv1 <- data
I0312 18:03:59.923656 16385 net.cpp:105] conv1 -> conv1
I0312 18:03:59.924268 16385 net.cpp:120] Top shape: 96 55 55
I0312 18:03:59.924279 16385 net.cpp:141] conv1 needs backward computation.
I0312 18:03:59.924289 16385 net.cpp:70] Creating Layer relu1
I0312 18:03:59.924295 16385 net.cpp:80] relu1 <- conv1
I0312 18:03:59.924303 16385 net.cpp:94] relu1 -> conv1 (in-place)
I0312 18:03:59.924314 16385 net.cpp:120] Top shape: 96 55 55
I0312 18:03:59.924319 16385 net.cpp:141] relu1 needs backward computation.
I0312 18:03:59.924326 16385 net.cpp:70] Creating Layer pool1
I0312 18:03:59.924331 16385 net.cpp:80] pool1 <- conv1
I0312 18:03:59.924345 16385 net.cpp:105] pool1 -> pool1
I0312 18:03:59.924365 16385 net.cpp:120] Top shape: 96 27 27
I0312 18:03:59.924371 16385 net.cpp:141] pool1 needs backward computation.
I0312 18:03:59.924379 16385 net.cpp:70] Creating Layer norm1
I0312 18:03:59.924386 16385 net.cpp:80] norm1 <- pool1
I0312 18:03:59.924394 16385 net.cpp:105] norm1 -> norm1
I0312 18:03:59.924407 16385 net.cpp:120] Top shape: 96 27 27
I0312 18:03:59.924412 16385 net.cpp:141] norm1 needs backward computation.
F0312 18:03:59.924424 16385 layer_factory.cpp:65] Unknown layer name: padding
*** Check failure stack trace: ***
Aborted


@kloudkl (Contributor, Author) commented Mar 12, 2014

This happens after #128. The problem should be fixed by #170; until then, please train new models yourself.

@sguada (Contributor) commented Mar 12, 2014

@palmforest you don't need to train new models yourself (as @kloudkl suggested); you can just edit the prototxt files to remove the padding layers and add the corresponding pad parameter to the convolution layers. At some point we will have an automatic fixer (#170).

For example

layers {
  layer {
    name: "pad2"
    type: "padding"
    pad: 2
  }
  bottom: "norm1"
  top: "pad2"
}
layers {
  layer {
    name: "conv2"
    type: "conv"
    num_output: 256
    group: 2
    kernelsize: 5
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1.
    }
    blobs_lr: 1.
    blobs_lr: 2.
    weight_decay: 1.
    weight_decay: 0.
  }
  bottom: "pad2"
  top: "conv2"
}

becomes

layers {
  layer {
    name: "conv2"
    type: "conv"
    num_output: 256
    group: 2
    kernelsize: 5
    pad: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1.
    }
    blobs_lr: 1.
    blobs_lr: 2.
    weight_decay: 1.
    weight_decay: 0.
  }
  bottom: "norm1"
  top: "conv2"
}

@palmforest

Thanks! @kloudkl , @sguada

I followed sguada's instructions to modify the prototxt. The model now loads correctly. However, it gives the following error: target_blobs[j]->width() == source_layer.blobs(j).width() (1024 vs. 9216). Do you know how to fix this?

I0313 17:55:09.560662 20043 net.cpp:120] Top shape: 1000 1 1
I0313 17:55:09.560678 20043 net.cpp:141] prob needs backward computation.
I0313 17:55:09.560684 20043 net.cpp:152] This network produces output label
I0313 17:55:09.560705 20043 net.cpp:152] This network produces output prob
I0313 17:55:09.560742 20043 net.cpp:168] Collecting Learning Rate and Weight Decay.
I0313 17:55:09.560760 20043 net.cpp:162] Network initialization done.
F0313 17:55:09.565565 20043 net.cpp:283] Check failed: target_blobs[j]->width() == source_layer.blobs(j).width() (1024 vs. 9216)
*** Check failure stack trace: ***
Aborted


@sergeyk (Contributor) commented Mar 13, 2014

@kloudkl, could you additionally provide a documented example of using this (something in the examples/ folder, such as an IPython notebook)?

@sergeyk (Contributor) commented Mar 13, 2014

(To be clear, I'm not suggesting you use the Python interface. A simple text/markdown file demonstrating how to run your tool will suffice -- this can go into docs/.)

@kloudkl mentioned this pull request Mar 15, 2014
@kloudkl (Contributor, Author) commented Mar 15, 2014

OK, I will add a doc showing how to run the code with the pre-trained reference model.

@kloudkl (Contributor, Author) commented Mar 17, 2014

@palmforest, I did not encounter the problem you described when extracting features with the caffe_reference_imagenet_model following the steps documented in docs/feature_extraction.md. Would you like to try again and see if the doc helps?

@sergeyk, do you think the documentation is appropriate?

echo "Downloading..."

wget -q https://www.dropbox.com/s/n3jups0gr7uj0dv/caffe_reference_imagenet_model
wget -q https://www.dropbox.com/s/n3jups0gr7uj0dv/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.dropbox.com/s/n3jups0gr7uj0dv/$MODEL

@kloudkl mentioned this pull request Mar 19, 2014
@sergeyk (Contributor) commented Mar 19, 2014

I will review now.

@sergeyk (Contributor) commented Mar 20, 2014

Looks good enough for me. Upon merge (forthcoming, blocked by Evan's request to wait until some fix he is pushing), I will immediately push a commit to

  • significantly change the documentation file
  • link to it from index.md
  • remove the image resizing script, since (a) it does not work and (b) it is obviated by using ImagesLayer
  • add a sample prototxt that uses ImagesLayer.

@kloudkl, it would be good if you contributed, in a separate PR, a section of the doc showing how the stored features are accessed from another script or directly from leveldb.
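
For example, reading the stored features back from leveldb might look roughly like this (a sketch; it assumes the features were written as serialized Datum messages with the values in float_data, and the output path is illustrative):

#include <leveldb/db.h>
#include "caffe/proto/caffe.pb.h"

leveldb::DB* db;
leveldb::Options options;
options.create_if_missing = false;
leveldb::Status status =
    leveldb::DB::Open(options, "examples/feature_extraction/features", &db);
CHECK(status.ok()) << status.ToString();  // CHECK comes from glog

// Each key identifies one image; each value is a serialized Datum.
leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
for (it->SeekToFirst(); it->Valid(); it->Next()) {
  caffe::Datum datum;
  datum.ParseFromString(it->value().ToString());
  for (int i = 0; i < datum.float_data_size(); ++i) {
    float value = datum.float_data(i);  // i-th feature value for this image
    // ... use the feature ...
  }
}
delete it;
delete db;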

sergeyk added a commit that referenced this pull request Mar 20, 2014
Feature extraction, feature binarization and image retrieval examples
@sergeyk sergeyk merged commit c10ba54 into BVLC:dev Mar 20, 2014
@shelhamer (Member)

A workflow suggestion: instead of hitting the merge button and then trying to commit and push before anyone else, you could check out this PR's branch and merge locally, then commit on top of it and push it all at once to dev.

Do whatever works for you though!

sergeyk added a commit that referenced this pull request Mar 20, 2014
- significantly change the documentation file
- link to it from index.md
- remove the image resizing script, since (a) it does not work and (b) it is obviated by using ImagesLayer
- add a sample prototxt that uses ImagesLayer.
@ghost commented Apr 4, 2014

When I followed the steps and tried to run the feature extraction, I got an error:

build/tools/extract_features.bin models/caffe_reference_imagenet_model examples/images/imagenet_val.prototxt fc7 examples/feature_extraction/features 10
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0404 00:38:19.440518 14421 extract_features.cpp:54] Using CPU
I0404 00:38:23.115815 14421 net.cpp:74] Creating Layer data
I0404 00:38:23.115854 14421 net.cpp:110] data -> data
I0404 00:38:23.115864 14421 net.cpp:110] data -> label
I0404 00:38:23.115869 14421 net.cpp:122] Setting up data
I0404 00:38:23.115900 14421 data_layer.cpp:139] Opening leveldb ./imagenet_256_256_leveldb
Segmentation fault (core dumped)

Would anyone have any idea what is causing this? Many thanks!

@kloudkl deleted the simplify_feature_extraction branch April 4, 2014 08:46
@kloudkl (Contributor, Author) commented Apr 5, 2014

The error message suggests that the leveldb already exists; you can delete it with rm -rf ./imagenet_256_256_leveldb.

I0404 00:38:23.115900 14421 data_layer.cpp:139] Opening leveldb ./imagenet_256_256_leveldb
Segmentation fault (core dumped)
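
For context, the two failure modes seen in this thread come down to how the leveldbs are opened (a sketch; the exact options used by the tool and the data layer are assumptions inferred from the errors in this thread):

// Input leveldb, read by the data layer: it must already exist.
leveldb::DB* input_db;
leveldb::Options in_opts;
in_opts.create_if_missing = false;  // missing db -> "Invalid argument: ... does not exist"
leveldb::DB::Open(in_opts, input_path, &input_db);

// Output leveldb, written by the tool: it must not already exist.
leveldb::DB* output_db;
leveldb::Options out_opts;
out_opts.create_if_missing = true;
out_opts.error_if_exists = true;    // pre-existing db -> the open fails
leveldb::DB::Open(out_opts, output_path, &output_db);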

@ghost commented Apr 7, 2014

Hi, sorry, I am pretty confused... I am now saving the created leveldb to a different path, but I still get the same error:

build/tools/extract_features.bin models/caffe_reference_imagenet_model examples/images/imagenet_val.prototxt fc7 ./save 10
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0407 10:34:25.081058 1944 extract_features.cpp:54] Using CPU
I0407 10:34:28.782899 1944 net.cpp:74] Creating Layer data
I0407 10:34:28.782938 1944 net.cpp:110] data -> data
I0407 10:34:28.782948 1944 net.cpp:110] data -> label
I0407 10:34:28.782954 1944 net.cpp:122] Setting up data
I0407 10:34:28.782986 1944 data_layer.cpp:139] Opening leveldb ./images_256_256_leveldb

Segmentation fault (core dumped)

Thanks!

Hmm, OK, I think there's something wrong with the data format I feed into the network. I've got an error message like this:

Check failed: target_blobs[j]->width() == source_layer.blobs(j).width() (576 vs. 1024)

@sguada (Contributor) commented Apr 7, 2014

You need to create the input leveldb before running the tool.

In reply to @ghost's follow-up:

"Hi, sorry I am pretty confused... isn't it supposed to exist? I thought extract_features.bin loads the leveldb and the training model and computes the features?

If I delete it I get an invalid argument error (which makes more sense to me):

build/tools/extract_features.bin models/caffe_reference_imagenet_model examples/images/imagenet_val.prototxt fc7 examples/images/features 10

WARNING: Logging before InitGoogleLogging() is written to STDERR
E0407 10:07:24.676210 1584 extract_features.cpp:54] Using CPU
I0407 10:07:28.368263 1584 net.cpp:74] Creating Layer data
I0407 10:07:28.368309 1584 net.cpp:110] data -> data
I0407 10:07:28.368319 1584 net.cpp:110] data -> label
I0407 10:07:28.368327 1584 net.cpp:122] Setting up data
I0407 10:07:28.368357 1584 data_layer.cpp:139] Opening leveldb (test)./images_256_256_leveldb
I0407 10:07:28.368646 1584 data_layer.cpp:142] opening is successful to this point
F0407 10:07:28.368670 1584 data_layer.cpp:143] Check failed: status.ok() Failed to open leveldb ./images_256_256_leveldb
Invalid argument: ./images_256_256_leveldb: does not exist (create_if_missing is false)

*** Check failure stack trace: ***
Aborted (core dumped)

Thanks!"

mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
Feature extraction, feature binarization and image retrieval examples
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
- significantly change the documentation file
- link to it from index.md
- remove the image resizing script, since (a) it does not work and (b) it is obviated by using ImagesLayer
- add a sample prototxt that uses ImagesLayer.
lukeyeager pushed a commit to lukeyeager/caffe that referenced this pull request Jun 8, 2016