add iou similarity operator #7566
auto x_dims = ctx->GetInputDim("X");
auto y_dims = ctx->GetInputDim("Y");

PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]");

The rank of Input(X) must be 2.

done

using framework::OperatorWithKernel::OperatorWithKernel;

protected:
void InferShape(framework::InferShapeContext *ctx) const override {

PADDLE_ENFORCE(ctx->HasInput("X"),
               "Input(X) of IOUSimilarityOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Y"),
               "Input(Y) of IOUSimilarityOp should not be null.");

done

PADDLE_ENFORCE_EQ(x_dims.size(), 2UL, "The shape of X is [N, 4]");
PADDLE_ENFORCE_EQ(x_dims[1], 4UL, "The shape of X is [N, 4]");
PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The shape of Y is [M, 4]");

The rank of Input(Y) must be 2.

done

AddInput(
    "X",
    "(Tensor, default Tensor<float>) "
    "BoxList X holding N boxes, each box is "

Box list X holds ...

done

"X",
"(Tensor, default Tensor<float>) "
"BoxList X holding N boxes, each box is "
"represented as [xmin, ymin, xmax, ymax], the shape of X is [N, 4].");

Better to explain the meaning of xmin, ymin, xmax, ymax.

done

AddComment(R"DOC(
IOU Similarity Operator.
Computes pairwise intersection-over-union between box collections.

intersection-over-union (IOU) between two box lists.

done

platform::ForRange<DeviceContext> for_range(
    static_cast<const DeviceContext&>(ctx.device_context()), x_n);
for_range(functor);

platform::ForRange supports GPU; please register a GPU kernel.

done

self.check_output()

def test_check_grad(self):
    return

Remove test_check_grad

done

def setUp(self):
    self.op_type = "iou_similarity"
    self.set_data()

If there is only one test, the code in set_data() and init_test_data() can be moved here.

done

[0.0, 0.0, 20.0, 20.0]]).astype('float32')
self.output = np.array(
    [[2.0 / 16.0, 0, 6.0 / 400.0],
     [1.0 / 16.0, 0.0, 5.0 / 400.0]]).astype('float32')

Better to use random data and calculate the IOU in Python.

Calculating it in Python would be better, but this version uses fixed data to verify that the operator runs first.

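The reviewer's suggestion could look roughly like the sketch below: generate random boxes and compute the expected IOU matrix in plain Python. The helper names `iou`, `random_box` and `iou_matrix` are hypothetical and do not come from the PR, and clamping a negative overlap to zero is an assumption about how the operator treats non-overlapping boxes. A real test would presumably wrap the inputs and the result in float32 NumPy arrays.

```python
import random


def iou(box_a, box_b):
    """IOU of two boxes given as [xmin, ymin, xmax, ymax]."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    # Non-overlapping boxes give a negative width or height; clamp to zero
    # (assumed behaviour, not confirmed by the PR thread).
    inter_area = max(inter_w, 0.0) * max(inter_h, 0.0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union_area = area_a + area_b - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


def random_box(rng, size=20.0):
    """Random box with xmin < xmax and ymin < ymax, roughly inside [0, size]."""
    x1, x2 = sorted(rng.uniform(0.0, size) for _ in range(2))
    y1, y2 = sorted(rng.uniform(0.0, size) for _ in range(2))
    return [x1, y1, x2 + 1e-3, y2 + 1e-3]  # small offset avoids zero area


def iou_matrix(boxes_x, boxes_y):
    """Expected [N, M] output: IOU of every box in X with every box in Y."""
    return [[iou(a, b) for b in boxes_y] for a in boxes_x]


rng = random.Random(0)  # fixed seed keeps the test reproducible
boxes_x = [random_box(rng) for _ in range(2)]  # N = 2
boxes_y = [random_box(rng) for _ in range(3)]  # M = 3
expected = iou_matrix(boxes_x, boxes_y)
```

Computing the expectation this way removes the hand-derived fractions from the test, at the cost of the reference sharing any misunderstanding of the formula with the operator.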
See the License for the specific language governing permissions and
limitations under the License. */

#define EIGEN_USE_GPU

Is this necessary?

done

PADDLE_ENFORCE_EQ(y_dims.size(), 2UL, "The rank of Input(Y) must be 2.");
PADDLE_ENFORCE_EQ(y_dims[1], 4UL, "The shape of Y is [M, 4]");

ctx->SetOutputDim("Out", framework::make_ddim({x_dims[0], y_dims[0]}));

Please consider 'X' as a LoDTensor. Here, the LoD of 'Out' should be inherited from 'X'.

done

Add ctx->ShareLoD("X", /*->*/ "Out"); in InferShape, like: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/mul_op.cc#L68

done

IOUSimilarityOpMaker(OpProto *proto, OpAttrChecker *op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
         "(Tensor, default Tensor<float>) "

X should be a LoDTensor.

done

AddComment(R"DOC(
IOU Similarity Operator.
Computes intersection-over-union (IOU) between two box lists.

The documentation is too brief. Please explain the function more clearly. 'X' should be a LoDTensor and 'Y' is a common Tensor; the boxes in 'Y' are shared by all input images.

done, added the formula

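The formula added to the operator documentation is not shown in this thread. The standard definition of IOU for two boxes A and B, which it presumably follows, is:

```latex
\mathrm{IOU}(A, B) =
  \frac{\mathrm{area}(A \cap B)}
       {\mathrm{area}(A) + \mathrm{area}(B) - \mathrm{area}(A \cap B)}
```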
T inter_xmax = xmax1 > xmax2 ? xmax2 : xmax1;
T inter_ymax = ymax1 > ymax2 ? ymax2 : ymax1;
T inter_xmin = xmin1 > xmin2 ? xmin1 : xmin2;
T inter_ymin = ymin1 > ymin2 ? ymin1 : ymin2;

Please use 'min' and 'max' to make the code more readable.

std::min can't run on the GPU.

"[xmax, ymax] is the right upper coordinate of the box.");

AddOutput("Out",
          "(LoDTensor or Tensor, the lod is same as input X) The output of "

LoDTensor or Tensor -> LoDTensor

done

"(LoDTensor, default LoDTensor<float>) "
"Box list X is a 2-D LoDTensor with shape [N, 4] holds N boxes, "
"each box is represented as [xmin, ymin, xmax, ymax], "
"the shape of X is [N, 4]. [xmin, ymin] is the lower left "

[xmin, ymin] is the top-left coordinate of the box if the input is an image feature map; it is the corner closest to the origin of the coordinate system. Modify the other places too.

done

IOU Similarity Operator.
Computes intersection-over-union (IOU) between two box lists.
Box list 'X' should be a LoDTensor and 'Y' is a common Tensor,
boxes in 'Y' are shared by all input images.

by all instances of the batched inputs of X.

done

Computes intersection-over-union (IOU) between two box lists.
Box list 'X' should be a LoDTensor and 'Y' is a common Tensor,
boxes in 'Y' are shared by all input images.
Given two box A and B, the calculation of IOU is as follows:

Given two boxes of A and B,

done

AddInput("Y",
         "(Tensor, default Tensor<float>) "
         "Box list Y holds M boxes, each box is represented as "
         "[xmin, ymin, xmax, ymax], the shape of X is [N, 4]. "

the shape of X is [N, 4] -> the shape of Y is [M, 4]

T y_min1 = x_[row_id * 4 + 1];
T x_max1 = x_[row_id * 4 + 2];
T y_max1 = x_[row_id * 4 + 3];
for (size_t i = 0; i < cols_; ++i) {

Here, cols_ is the number of prior boxes; in SSD this number is about 8732 or more, so this loop is inefficient on the GPU. This will be fixed later.

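The thread does not say how this will be fixed. One possible approach, sketched here purely as an assumption, is to parallelize over box pairs instead of rows, so that each work item computes a single IOU entry. The function name `pair_index` is hypothetical; on the GPU each flat index would correspond to one thread.

```python
def pair_index(flat_id, cols):
    """Map a flat work-item index to the (row, col) pair it would handle."""
    return flat_id // cols, flat_id % cols


rows, cols = 2, 3  # N boxes in X, M boxes in Y
pairs = [pair_index(k, cols) for k in range(rows * cols)]
# Each of the rows * cols work items handles exactly one IOU entry,
# so no single work item loops over all `cols` prior boxes.
```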
resolve #7565