
Bug (?) when normalising coordinates #8

Closed
DrSleep opened this issue Oct 17, 2016 · 2 comments


DrSleep commented Oct 17, 2016

Hi, @daerduoCarey

I have a question on the part of the code that deals with normalising the coordinates:
st_layer.cpp#L114-119

    Dtype* data = output_grid.mutable_cpu_data();
    for(int i=0; i<output_H_ * output_W_; ++i) {
        data[3 * i] = (i / output_W_) * 1.0 / output_H_ * 2 - 1;
        data[3 * i + 1] = (i % output_W_) * 1.0 / output_W_ * 2 - 1;
        data[3 * i + 2] = 1;
    }

If I have understood the paper correctly, the normalised coordinates should lie in [-1, 1]. In the code above, however, the largest value is 1 - 2/output_W_ for x (and 1 - 2/output_H_ for y), so the upper bound 1 is never reached.

For example, for output_H_ = 2, output_W_ = 3 the result is as follows:

i: 0 [-1, -1, 1]
i: 1 [-1, -0.333333, 1]
i: 2 [-1, 0.333333, 1]
i: 3 [0, -1, 1]
i: 4 [0, -0.333333, 1]
i: 5 [0, 0.333333, 1]

So, shouldn't it be something like this instead?

    Dtype* data = output_grid.mutable_cpu_data();
    for(int i=0; i<output_H_ * output_W_; ++i) {
        data[3 * i] = (i / output_W_) * 1.0 / (output_H_ - 1) * 2 - 1;
        data[3 * i + 1] = (i % output_W_) * 1.0 / (output_W_ - 1) * 2 - 1;
        data[3 * i + 2] = 1;
    }

Which generates the following normalised coordinates:

i: 0 [-1, -1, 1]
i: 1 [-1, 0, 1]
i: 2 [-1, 1, 1]
i: 3 [1, -1, 1]
i: 4 [1, 0, 1]
i: 5 [1, 1, 1]

Thanks.

daerduoCarey (Owner) commented

Hi, @DrSleep ,

Thank you for your interest in my implementation and for your careful examination of the code.

I think this is a quantization issue. We have to discretize the output image space into a grid somehow and compute the values at the grid points via interpolation. The most principled implementation would probably add grid_size/2 to all of the computed grid coordinates before applying the transformation matrix, so that samples land on cell centers. But when output_W_ and output_H_ are large enough (64 is probably sufficient, unlike 2 and 3 in your example), the discrepancy should not be dramatic.

Different quantization schemes can indeed produce slightly different output images for the same input image and transformation matrix. However, any of them is still safe to use: the learning process of the mini-network that predicts the transformation matrix should be aware of this and adjust, producing a slightly different matrix that offsets the quantization.

I hope my response helps you understand the issue. More questions are warmly welcome!

Thank you.

Bests,
Kaichun Mo

@DrSleep
Copy link
Author

DrSleep commented Nov 23, 2016

Yes, I agree with your points: for big images this should not pose a problem.

Thanks for your comment!
