input and output tensors. #60

Open
slimcdk opened this issue Apr 3, 2019 · 5 comments

Comments

@slimcdk

slimcdk commented Apr 3, 2019

I'm in the process of converting the model to TensorFlow Lite, but I'm not very experienced with TensorFlow yet.

For the conversion I need the input and output tensor sizes. Where can I find those?

Will the input be the image size and color channels, e.g. [None, FLAGS.input_size, FLAGS.input_size, 3]?
And for the output, would that just be the num_of_joints number?

To clarify my question, I'm using the second code snippet provided by Pannag Sanketi: https://stackoverflow.com/questions/50632152/tensorflow-convert-pb-file-to-tflite-using-python

@hkawii

hkawii commented May 12, 2019

Hello @slimcdk, did you find out how to convert it to TensorFlow Lite yet?
I'm searching for a way to do that but don't know where to start.

@slimcdk
Author

slimcdk commented May 12, 2019

Hi

Yeah, take a look at my fork of the repo: https://github.com/slimcdk/convolutional-pose-machines-tensorflow
I found that the first tensor has a misspelling in the provided weights, which is corrected in the model source code.

I also managed to run inference on the model, but the processing time was between 2 and 3 seconds on a Galaxy S10. I still need to create the Kalman filter and possibly the tracker module, or you could just feed the model a fixed resolution.
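
In case it helps with the tensor sizes from the original question, here is a minimal sketch of reading them back from a converted file. It assumes TensorFlow is installed and uses `tf.lite.Interpreter`. The helper names `describe` and `load_io_shapes` are made up for illustration and are not part of the repo.

```python
def describe(details):
    """Turn interpreter tensor details into (name, shape) pairs.

    `details` is the list of dicts returned by get_input_details() or
    get_output_details(); only the "name" and "shape" keys are used here.
    """
    return [(d["name"], tuple(int(x) for x in d["shape"])) for d in details]


def load_io_shapes(model_path):
    """Return (inputs, outputs) as lists of (name, shape) for a .tflite file."""
    # Imported lazily so that describe() can be used without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return (
        describe(interpreter.get_input_details()),
        describe(interpreter.get_output_details()),
    )
```

Calling `load_io_shapes` with the path to your own .tflite file should list the tensor names and shapes. I would expect the input to look like [1, input_size, input_size, 3] and the output to have one channel per joint in its last dimension, but check that against your own file.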

@hkawii

hkawii commented May 25, 2019

Hello @slimcdk, thank you so much for your help. I finally managed to convert it to a tflite file thanks to your comment.

I also managed to run inference on an iOS device, and the processing time is better there.

But there is a problem with the output:
How can we get the labels from it? Forgive me, I'm completely new to this field and would appreciate any help.

This is an example of the output:

[[[[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  ...

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]]]


@slimcdk
Author

slimcdk commented May 26, 2019

Think about how the three color channels (red, green, blue) of a regular image are stacked as layers.

The output of this model is similar, but instead of three color channels you get 21 channels (heatmaps), one for each joint. Each heatmap is a 2D array of values near zero (black pixels), except where a joint has been recognized; those spots are close to one (white), hence the name heatmap.

Each layer/heatmap is a 2D array, which can be seen as an x and y coordinate system. What you would do is first find the highest value in the heatmap and then find the x and y indexes of that value.

The above calculation is done right here: https://github.com/timctho/convolutional-pose-machines-tensorflow/blob/master/run_demo_hand_with_tracker.py#L298-L299
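
As a rough sketch of that step (not the repo's exact code), assuming the output has shape [1, height, width, num_joints] and NumPy is available, you could do something like the following. The function name `decode_heatmaps` is just for illustration.

```python
import numpy as np


def decode_heatmaps(output):
    """Find the peak of each heatmap.

    output: array of shape (1, H, W, J) or (H, W, J), one channel per joint.
    Returns (coords, scores): coords has shape (J, 2) holding (x, y) indexes
    in heatmap pixels, scores has shape (J,) holding the peak values.
    """
    heatmaps = np.asarray(output)
    if heatmaps.ndim == 4:
        heatmaps = heatmaps[0]  # drop the batch dimension

    height, width, num_joints = heatmaps.shape
    flat = heatmaps.reshape(-1, num_joints)   # (H*W, J)
    best = flat.argmax(axis=0)                # flat index of the peak per joint
    ys, xs = np.unravel_index(best, (height, width))
    scores = flat[best, np.arange(num_joints)]
    return np.stack([xs, ys], axis=1), scores
```

The coordinates are in heatmap pixels, so if the heatmap resolution differs from your input image you would still need to scale them. The peak value can also serve as a rough confidence, for example to skip joints whose score is very low.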

This image visualizes all 21 heatmaps as a single layer, but behind the scenes each one is its own layer.
[image]

@luchen828

Hi, do you know what the numbers in the label txt are? Is it the name of the jpg + 4 coordinates of the hand bbox + 21 coordinates of the hand keypoints?
