add iou similarity operator #7566

Merged: wanghaox merged 9 commits into PaddlePaddle:develop on Jan 23, 2018

Conversation

wanghaox
Contributor

resolve #7565

wanghaox requested review from pkuyym and qingqing01 on January 16, 2018, 10:46
auto x_dims = ctx->GetInputDim("X");
auto y_dims = ctx->GetInputDim("Y");

PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]");
Contributor

The rank of Input(X) must be 2.

Contributor Author

done

using framework::OperatorWithKernel::OperatorWithKernel;

protected:
void InferShape(framework::InferShapeContext *ctx) const override {
Contributor

 PADDLE_ENFORCE(ctx->HasInput("X"),
                   "Input(X) of IOUSimilarityOp should not be null.");
 PADDLE_ENFORCE(ctx->HasInput("Y"),
                   "Input(Y) of IOUSimilarityOp should not be null.");

Contributor Author

done


PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]");
PADDLE_ENFORCE_EQ(x_dims[1], 4UL, "The shape of X is [N, 4]");
PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The shape of Y is [M, 4]");
Contributor

The rank of Input(Y) must be 2.

Contributor Author

done
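
The shape checks discussed in this thread can be summarized in a few lines. The following is a minimal Python sketch, not the operator's actual code: the function name `infer_iou_output_shape` is hypothetical, and the error messages simply reuse the wording suggested in the review.

```python
def infer_iou_output_shape(x_shape, y_shape):
    """Mirror the InferShape checks from the review for plain shape tuples.

    Hypothetical helper for illustration only; the real checks are the
    PADDLE_ENFORCE_EQ calls in the C++ operator.
    """
    if len(x_shape) != 2:
        raise ValueError("The rank of Input(X) must be 2.")
    if x_shape[1] != 4:
        raise ValueError("The shape of X is [N, 4]")
    if len(y_shape) != 2:
        raise ValueError("The rank of Input(Y) must be 2.")
    if y_shape[1] != 4:
        raise ValueError("The shape of Y is [M, 4]")
    # One IOU value per (X box, Y box) pair, so the output is [N, M].
    return (x_shape[0], y_shape[0])


print(infer_iou_output_shape((2, 4), (3, 4)))
```

Separating the rank check from the size-of-last-dimension check is what lets each failure report a distinct message.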

AddInput(
"X",
"(Tensor, default Tensor<float>) "
"BoxList X holding N boxes, each box is "
Contributor

Box list X holds ...

Contributor Author

done

"X",
"(Tensor, default Tensor<float>) "
"BoxList X holding N boxes, each box is "
"represented as [xmin, ymin, xmax, ymax], the shape of X is [N, 4].");
Contributor

It would be better to explain the meaning of xmin, ymin, xmax, and ymax.

Contributor Author

done


AddComment(R"DOC(
IOU Similarity Operator.
Computes pairwise intersection-over-union between box collections.
Contributor

intersection-over-union (IOU) between two box lists.

Contributor Author

done


platform::ForRange<DeviceContext> for_range(
static_cast<const DeviceContext&>(ctx.device_context()), x_n);
for_range(functor);
Contributor

platform::ForRange supports GPU; please register a GPU kernel.

Contributor Author

done

        self.check_output()

    def test_check_grad(self):
        return
Contributor

Remove test_check_grad

Contributor Author

done


    def setUp(self):
        self.op_type = "iou_similarity"
        self.set_data()
Contributor

If there is only one test, the code in set_data() and init_test_data() can be moved here.

Contributor Author

done

             [0.0, 0.0, 20.0, 20.0]]).astype('float32')
        self.output = np.array(
            [[2.0 / 16.0, 0, 6.0 / 400.0],
             [1.0 / 16.0, 0.0, 5.0 / 400.0]]).astype('float32')
Contributor

It would be better to use random data and calculate the IOU in Python.

Contributor Author

Calculating it in Python would be better, but this version uses fixed data to verify that the operator runs first.
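
The reviewer's suggestion (random inputs plus an IOU computed in Python) could look roughly like the sketch below. It is an illustration, not the test that was merged: `iou_reference` and `random_boxes` are hypothetical helper names, boxes are assumed to be well formed (xmin < xmax, ymin < ymax), and non-overlapping boxes are assumed to yield an IOU of 0.

```python
import numpy as np


def iou_reference(boxes_x, boxes_y):
    """Pairwise IOU of [N, 4] and [M, 4] boxes given as [xmin, ymin, xmax, ymax]."""
    n, m = boxes_x.shape[0], boxes_y.shape[0]
    out = np.zeros((n, m), dtype="float32")
    for i in range(n):
        xmin1, ymin1, xmax1, ymax1 = boxes_x[i]
        area1 = (xmax1 - xmin1) * (ymax1 - ymin1)
        for j in range(m):
            xmin2, ymin2, xmax2, ymax2 = boxes_y[j]
            area2 = (xmax2 - xmin2) * (ymax2 - ymin2)
            # Clamp at zero so disjoint boxes contribute no intersection.
            inter_w = max(min(xmax1, xmax2) - max(xmin1, xmin2), 0.0)
            inter_h = max(min(ymax1, ymax2) - max(ymin1, ymin2), 0.0)
            inter = inter_w * inter_h
            union = area1 + area2 - inter
            out[i, j] = inter / union if union > 0 else 0.0
    return out


def random_boxes(rng, count):
    """Random well-formed boxes: a min corner plus a strictly positive size."""
    mins = rng.uniform(0.0, 10.0, size=(count, 2))
    sizes = rng.uniform(1.0, 10.0, size=(count, 2))
    return np.hstack([mins, mins + sizes]).astype("float32")


rng = np.random.default_rng(0)
x = random_boxes(rng, 2)   # stands in for input X, shape [2, 4]
y = random_boxes(rng, 3)   # stands in for input Y, shape [3, 4]
expected = iou_reference(x, y)
print(expected.shape)
```

In a unit test, `x` and `y` would be fed to the operator and `expected` used as the reference output, which removes the need for hand-computed constants.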

See the License for the specific language governing permissions and
limitations under the License. */

#define EIGEN_USE_GPU
Contributor

Is this necessary?

Contributor Author

done

PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The rank of Input(Y) must be 2.");
PADDLE_ENFORCE_EQ(y_dims[1], 4UL, "The shape of Y is [M, 4]");

ctx->SetOutputDim("Out", framework::make_ddim({x_dims[0], y_dims[0]}));
Contributor

Please treat 'X' as a LoDTensor. The LoD of 'Out' should be inherited from 'X'.

Contributor Author

done

Contributor

Add ctx->ShareLoD("X", /*->*/ "Out"); in InferShape, as in: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/mul_op.cc#L68

Contributor Author

done
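
To illustrate why 'Out' should share the LoD of 'X': each row of the [N, M] output corresponds to one box in 'X', so whatever grouping applies to the rows of 'X' applies unchanged to the rows of 'Out'. The sketch below assumes an offset-style LoD (for example `[0, 2, 5]` meaning rows 0-1 belong to the first instance and rows 2-4 to the second); the helper name and the data are hypothetical.

```python
def split_rows_by_lod(rows, lod_offsets):
    """Split rows into per-instance groups using offset-style LoD."""
    return [
        rows[start:end]
        for start, end in zip(lod_offsets[:-1], lod_offsets[1:])
    ]


x_lod = [0, 2, 5]  # hypothetical LoD of X: two instances with 2 and 3 boxes
# Stand-in for an [N, M] output with N = 5 rows and M = 3 columns.
out_rows = [[0.1 * i] * 3 for i in range(5)]

groups = split_rows_by_lod(out_rows, x_lod)
print([len(group) for group in groups])
```

Because 'Y' is shared by every instance, only the row dimension carries per-instance structure, which is why the LoD is taken from 'X' alone.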

IOUSimilarityOpMaker(OpProto *proto, OpAttrChecker *op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
"(Tensor, default Tensor<float>) "
Contributor

X should be a LoDTensor.

Contributor Author

done


AddComment(R"DOC(
IOU Similarity Operator.
Computes intersection-over-union (IOU) between two box lists.
Contributor

The documentation is too brief. Please explain the function more clearly. 'X' should be a LoDTensor and 'Y' is an ordinary Tensor; the boxes in 'Y' are shared by all input images.

Contributor Author

done, added the formula

T inter_xmax = xmax1 > xmax2 ? xmax2 : xmax1;
T inter_ymax = ymax1 > ymax2 ? ymax2 : ymax1;
T inter_xmin = xmin1 > xmin2 ? xmin1 : xmin2;
T inter_ymin = ymin1 > ymin2 ? ymin1 : ymin2;
Contributor

Please use 'min' and 'max' to make the code more readable.

Contributor Author

std::min can't run on the GPU.

"[xmax, ymax] is the right upper coordinate of the box.");

AddOutput("Out",
"(LoDTensor or Tensor, the lod is same as input X) The output of "
Contributor

LoDTensor or Tensor -> LoDTensor

Contributor Author

done

"(LoDTensor, default LoDTensor<float>) "
"Box list X is a 2-D LoDTensor with shape [N, 4] holds N boxes, "
"each box is represented as [xmin, ymin, xmax, ymax], "
"the shape of X is [N, 4]. [xmin, ymin] is the lower left "
Contributor

[xmin, ymin] is the top-left coordinate of the box if the input is an image feature map; it is the corner closest to the origin of the coordinate system.

Please update the other places too.

Contributor Author

done

IOU Similarity Operator.
Computes intersection-over-union (IOU) between two box lists.
Box list 'X' should be a LoDTensor and 'Y' is a common Tensor,
boxes in 'Y' are shared by all input images.
Contributor

by all instances of the batched inputs of X.

Contributor Author

done

Computes intersection-over-union (IOU) between two box lists.
Box list 'X' should be a LoDTensor and 'Y' is a common Tensor,
boxes in 'Y' are shared by all input images.
Given two box A and B, the calculation of IOU is as follows:
Contributor

Given two boxes A and B,

Contributor Author

done
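
The formula added to the operator documentation is not reproduced in this thread. For reference, the standard definition of IOU for two boxes A and B, which is presumably what the documentation states, can be written as:

```latex
\mathrm{IOU}(A, B)
  = \frac{\mathrm{area}(A \cap B)}{\mathrm{area}(A \cup B)}
  = \frac{\mathrm{area}(A \cap B)}
         {\mathrm{area}(A) + \mathrm{area}(B) - \mathrm{area}(A \cap B)}
```

The second form is the one that is convenient to compute, since the intersection of two axis-aligned boxes is itself an axis-aligned box (or empty).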

AddInput("Y",
"(Tensor, default Tensor<float>) "
"Box list Y holds M boxes, each box is represented as "
"[xmin, ymin, xmax, ymax], the shape of X is [N, 4]. "
Contributor

the shape of X is [N, 4] -> the shape of Y is [M, 4]

T y_min1 = x_[row_id * 4 + 1];
T x_max1 = x_[row_id * 4 + 2];
T y_max1 = x_[row_id * 4 + 3];
for (size_t i = 0; i < cols_; ++i) {
Contributor

qingqing01 commented on Jan 23, 2018

Here, cols_ is the number of prior boxes. In SSD this number is about 8732 or more, so looping over it inside each work item is inefficient on GPU. This will be fixed later.
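
The thread does not say how this will be fixed. One common direction, shown here purely as an assumption, is to launch one work item per (row, column) pair instead of one per row, so that the inner loop over cols_ disappears. The index arithmetic for that mapping is sketched below; `pair_from_flat_index` is a hypothetical name.

```python
def pair_from_flat_index(idx, cols):
    """Map a flat work-item index to the (row, col) box pair it would handle."""
    return divmod(idx, cols)


rows, cols = 2, 8732  # 2 boxes in X, 8732 prior boxes in Y (figure from the comment)
total_work_items = rows * cols

print(total_work_items)                  # 17464
print(pair_from_flat_index(0, cols))     # (0, 0)
print(pair_from_flat_index(8732, cols))  # (1, 0)
```

With this layout the amount of parallel work grows with N * M rather than N, which matters when N is small and M is in the thousands.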

wanghaox merged commit d43932c into PaddlePaddle:develop on Jan 23, 2018
wanghaox deleted the iou_sim branch on January 23, 2018, 11:24

Successfully merging this pull request may close these issues.

Add IoU similarity operator.
3 participants