Add specification for SegmentMax-16
#28103
base: master
Conversation
Signed-off-by: p-wysocki <[email protected]>
Let us have `EmbeddingSegmentsMax`, similar to `EmbeddingSegmentsSum`.
It should also have a default index (defining the default value for empty segments).
* Segment_4: ``[]``
* Segment_5: ``[data[6], data[7]]``

When there are no values in a segment, ``output[segment]`` is set to 0.
We should have a default value for empty segments. Otherwise we will need an additional, non-trivial computation subgraph to find the empty segments and replace the zero value.
The default value seems to be 0, according to https://www.tensorflow.org/api_docs/python/tf/raw_ops/SegmentMax. I don't think we should expand the op on our own, especially since we only expect it to come from TF FE.
There is also V2, https://www.tensorflow.org/api_docs/python/tf/raw_ops/SegmentMaxV2, where the default has been changed to `numeric_limits<T>::lowest()`. Adding an attribute for the default value seems to be a simple solution to support both cases, but to enable V2 at once we would also need to consider the "num_segments" input.
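To make the two empty-segment conventions discussed above concrete, here is a minimal pure-Python sketch for 1D data. The function name `segment_max`, the `empty_value` parameter, and the sample values are hypothetical and are not part of the proposed spec. The segment IDs are chosen so that Segment_4 is empty and Segment_5 holds `data[6]` and `data[7]`, as in the quoted snippet; the remaining IDs are assumed. `-sys.float_info.max` stands in for `numeric_limits<T>::lowest()` of a double.

```python
import sys
from typing import List, Sequence


def segment_max(
    data: Sequence[float],
    segment_ids: Sequence[int],
    empty_value: float = 0.0,
) -> List[float]:
    """Illustrative 1D SegmentMax; `empty_value` is a hypothetical parameter."""
    if len(data) != len(segment_ids):
        raise ValueError("segment_ids must have one entry per data element")
    if len(segment_ids) == 0:
        return []
    # The first output dimension is max(segment_ids) + 1, so gaps in the
    # IDs produce empty segments that keep `empty_value`.
    num_segments = max(segment_ids) + 1
    out = [empty_value] * num_segments
    seen = [False] * num_segments
    for value, seg in zip(data, segment_ids):
        # The first value seen in a segment always replaces the placeholder.
        if not seen[seg] or value > out[seg]:
            out[seg] = value
            seen[seg] = True
    return out


# Assumed sample input: segments 2 and 4 are empty.
data = [-1.0, -5.0, 3.0, 2.0, 8.0, 4.0, -7.0, -6.0]
segment_ids = [0, 0, 1, 3, 3, 3, 5, 5]

zero_default = segment_max(data, segment_ids)
lowest_default = segment_max(data, segment_ids, empty_value=-sys.float_info.max)
```

With a default of 0, an empty segment is indistinguishable from a segment whose maximum is 0, and it compares larger than the result of an all-negative segment. That is the practical difference between the two conventions.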
* **1**: *data*

  * **Description**: The numerical data on which SegmentMax operation will be performed. **Required.**
Please define the shape of each input and of the output, and describe which dimensions must be equal.
`data` can have any rank and dimensions, so it's described as ND of any numerical type. `segment_ids` is specified to be a 1D tensor of non-negative, sorted integer numbers whose size is equal to the size of the first dimension of the input tensor.
Could you please specify what's missing? I think the shapes are covered, but I may be missing something.
Signed-off-by: p-wysocki <[email protected]>
**Outputs**

* **1**: The output tensor of type *T* and the same shape as the ``input`` tensor with the exception for the first dimension, which is equal to the count of unique segment IDs.
Suggested change:
- * **1**: The output tensor of type *T* and the same shape as the ``input`` tensor with the exception for the first dimension, which is equal to the count of unique segment IDs.
+ * **1**: The output tensor of type *T* and almost the same shape as the ``data`` input tensor with the exception for the first dimension, which is equal to the count of unique segment IDs (calculated as ``max(segment_ids) + 1``).
Maybe instead of "almost", use:
The output tensor has the same rank and dimensions as the ``data`` input tensor, except for the first dimension, which is calculated as ``max(segment_ids) + 1``?
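The shape rule proposed in this thread can be sketched as follows. The helper name `segment_max_output_shape` and the sample shape are hypothetical. The example also shows that ``max(segment_ids) + 1`` and the count of unique segment IDs differ whenever some segment is empty, so the two phrasings in the suggested wording are not interchangeable.

```python
from typing import Sequence, Tuple


def segment_max_output_shape(
    data_shape: Sequence[int],
    segment_ids: Sequence[int],
) -> Tuple[int, ...]:
    """Hypothetical helper: output shape under the max(segment_ids) + 1 rule."""
    if len(data_shape) == 0:
        raise ValueError("data must have rank >= 1")
    if len(segment_ids) != data_shape[0]:
        raise ValueError("len(segment_ids) must equal data_shape[0]")
    # Only the first dimension changes; all trailing dimensions are kept.
    first_dim = max(segment_ids) + 1 if len(segment_ids) > 0 else 0
    return (first_dim, *data_shape[1:])


# Assumed sample: segments 2 and 4 receive no data.
ids = [0, 0, 1, 3, 3, 3, 5, 5]
shape = segment_max_output_shape((8, 3, 2), ids)
unique_ids = len(set(ids))
```

Here `shape` is `(6, 3, 2)` while `unique_ids` is 4, because IDs 2 and 4 never occur.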
* **2**: *segment_ids*

  * **Description**: Controls how the data is divided into segments. **Required.**
  * **Range of values**: 1D tensor of non-negative, sorted integer numbers. Its size is equal to the size of the first dimension of the input tensor.
  * **Type**: *T_IDX*

**Outputs**

* **1**: The output tensor of type *T* and the same shape as the ``input`` tensor with the exception for the first dimension, which is equal to the count of unique segment IDs.
The "Inputs" section description follows the "Attributes" style rather than the style used for inputs.
Consider aligning it with other spec documents.
* **2**: *segment_ids*

  * **Description**: Controls how the data is divided into segments. **Required.**
  * **Range of values**: 1D tensor of non-negative, sorted integer numbers. Its size is equal to the size of the first dimension of the input tensor.
I can see that unsorted `segment_ids` may lead to an error or undefined behavior (implementation-specific, depending on the hardware).
Should we specify a common behavior for the OV op?
This can be clarified at the plugin implementation stage.
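One possible common behavior is to reject invalid `segment_ids` up front. The sketch below assumes that choice purely for illustration; the thread leaves the behavior open, and the names `check_segment_ids` and `is_valid` are hypothetical.

```python
from typing import Sequence


def check_segment_ids(segment_ids: Sequence[int], data_first_dim: int) -> None:
    """Raise ValueError if segment_ids violates the documented range of values.

    Raising is an assumed policy, not something the spec defines.
    """
    if len(segment_ids) != data_first_dim:
        raise ValueError("segment_ids size must equal the first dimension of data")
    previous = 0
    for position, seg in enumerate(segment_ids):
        if seg < 0:
            raise ValueError(f"negative segment id at position {position}")
        if seg < previous:
            raise ValueError(f"segment_ids not sorted at position {position}")
        previous = seg


def is_valid(segment_ids: Sequence[int], data_first_dim: int) -> bool:
    """Convenience wrapper returning a boolean instead of raising."""
    try:
        check_segment_ids(segment_ids, data_first_dim)
    except ValueError:
        return False
    return True
```

A check like this costs a single pass over `segment_ids`, which is why deferring the decision to each plugin remains a reasonable alternative.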
* **2**: *segment_ids*

  * **Description**: Controls how the data is divided into segments. **Required.**
  * **Range of values**: 1D tensor of non-negative, sorted integer numbers. Its size is equal to the size of the first dimension of the input tensor.
Suggested change:
- * **Range of values**: 1D tensor of non-negative, sorted integer numbers. Its size is equal to the size of the first dimension of the input tensor.
+ * **Range of values**: 1D tensor of non-negative, sorted integer numbers. Its size is equal to the size of the first dimension of the ``data`` input tensor.
SegmentMax
===================
Suggested change:
- SegmentMax
- ===================
+ SegmentMax
+ ==========
Should the number of `=` characters be the same as the heading length?
Details:
tf.math.segment_max (https://www.tensorflow.org/api_docs/python/tf/math/segment_max)
Tickets: