-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Opset13][FP8] Introduce FakeConvert op core #20930
[Opset13][FP8] Introduce FakeConvert op core #20930
Conversation
FakeConvert(const ov::Output<ov::Node>& arg, | ||
const ov::Output<ov::Node>& scale, | ||
const ov::Output<ov::Node>& shift, | ||
const std::string& destination_type = "F8E4M3", | ||
bool apply_scale = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @andreyanufr @AlexKoff88, I wonder can we drop the apply_scale
attribute in favour of having two constructors and then apply_scale if the scale
and shift
are provided, in other words if (inputs.size() == 3)
. Do you see a need to keep this attribute and switch off the scaling even if the inputs are provided?
FakeConvert(const ov::Output<ov::Node>& arg, | |
const ov::Output<ov::Node>& scale, | |
const ov::Output<ov::Node>& shift, | |
const std::string& destination_type = "F8E4M3", | |
bool apply_scale = true); | |
FakeConvert(const ov::Output<ov::Node>& arg, | |
const std::string& destination_type = "F8E4M3"); | |
FakeConvert(const ov::Output<ov::Node>& arg, | |
const ov::Output<ov::Node>& scale, | |
const ov::Output<ov::Node>& shift, | |
const std::string& destination_type = "F8E4M3"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apply_scale was added for research purposes only. This should not be in the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreyanufr, do we really need apply_scale
at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexKoff88 , we don't need it. I think we will put scale and shift outside of this operation in the network graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed it's better to keep scaling within the FakeConvert logic, in the updated version scale
input is required and shift
input optional.
The apply_scale
attribute has been removed.
|
||
private: | ||
void validate() const; | ||
std::string m_destination_type = "F8E4M3"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the FP8 types will be a part of the ov::element
, this destination type should be ov::element
, for now it can be string as proposed in the POC (or improved to be struct/enum).
FakeConvert(const ov::Output<ov::Node>& arg, | ||
const ov::Output<ov::Node>& scale, | ||
const ov::Output<ov::Node>& shift, | ||
const std::string& destination_type = "F8E4M3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change F8E4M3
to f8e4m3
per our agreement? Is it consistent with the names of other data types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update it to lowercase, in case of keeping it as string, I thought that both options should be allowed (upper and lower case) and unified with std::tolower
. We can stick with lower case only.
Currently for ov::element
upper and lower case are acceptable:
openvino/src/core/src/type/element_type.cpp
Lines 80 to 82 in d07f272
if (type == "f16" || type == "FP16") { | |
return ::ov::element::Type(::ov::element::Type_t::f16); | |
} else if (type == "f32" || type == "FP32") { |
Is it okay to keep the destination_type
as string (at least for now, and update it to ov::element
later) or it's better to introduce temporary enum representation for FP8 types attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is ok.
FakeConvert(const ov::Output<ov::Node>& arg, | ||
const ov::Output<ov::Node>& scale, | ||
const ov::Output<ov::Node>& shift, | ||
const std::string& destination_type = "F8E4M3", | ||
bool apply_scale = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apply_scale was added for research purposes only. This should not be in the current implementation.
FakeConvert(const ov::Output<ov::Node>& arg, | ||
const ov::Output<ov::Node>& scale, | ||
const ov::Output<ov::Node>& shift, | ||
const std::string& destination_type = "F8E4M3", | ||
bool apply_scale = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexKoff88 , we don't need it. I think we will put scale and shift outside of this operation in the network graph.
I don't think we should keep scale and shift outside of operation. Otherwise, it is gonna be long subgraphs again, such as Multiply(s)->Subtract(zp)->FakeConvert->Add(zp)->Multiply(1/s) for each operation and its weights. This will blow off the model significantly and lead to much larger representation, longer transformations, more complicated model processing, etc. It can hurt LLMs. |
src/core/src/op/fake_convert.cpp
Outdated
|
||
std::shared_ptr<ov::Node> FakeConvert::clone_with_new_inputs(const ov::OutputVector& new_args) const { | ||
OV_OP_SCOPE(v13_FakeConvert_clone_with_new_inputs); | ||
OPENVINO_ASSERT(new_args.size() == 3, "Incorrect number of new arguments"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As scale and shift are not mandatory, please make them optional. Otherwise I don't understand why we have apply_scale
, but don't have apply_shift
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you make both scale
and shift
optional, you really need to have only apply_scale
among attributes. Otherwise apply_shift
is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the shift an optional parameter. I would make scale as required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we always need to apply scale, but shift can be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, in the updated version scale
input is required and shift
input optional.
The apply_scale
attribute has been removed.
const ov::Output<ov::Node>& scale, | ||
const ov::Output<ov::Node>& shift, | ||
std::string destination_type = "f8e4m3", | ||
bool apply_scale = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason to have apply_scale
set to false by default, suggesting to set it automatically in case if scale input is provided (and sale and shift inputs are made optional as suggested in another comment). If one wants to provide shift but doesn't apply scale, then as we don't have "gaps" in the list of operation arguments, one should provide arbitrary scale (will be ignored), then shift and set apply_scale to false. This is only (and looks like very rare situation) when apply_scale should have manual control as a parameter of constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FakeConvert(input) ==> apply_scale = false (as no scale provided)
FakeConvert(input, scale) ==> apply_scale = true
FakeConvert(input, scale, shift) ==> apply_scale = true
FakeConvert(input, scale, shift, "f8e4m", false) ==> apply_scale = false, scale input is ignored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I started the conversation above: #20930 (comment)
So the final suggestion is to have three constructors and the apply_scale
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, in the updated version scale
input is required and shift
input optional.
The apply_scale
attribute has been removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary: change scale and shift to optional, rework ctor signature to be more practical.
@slyalin, @AlexKoff88, @andreyanufr As discussed offline, removed |
return std::make_shared<FakeConvert>(new_args.at(0), new_args.at(1), m_destination_type); | ||
} else if (new_args.size() == 3) { | ||
return std::make_shared<FakeConvert>(new_args.at(0), new_args.at(1), new_args.at(2), m_destination_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mitruska, let me put it here but it is not a blocker for this PR, just a general comment for Op enabling that can be made simpler: clone_with_new_inputs
can be generated automatically with using default ctor that is available for all operations, and the visitors. It is applicable for all the operations and utilizing such a mechanism will simplify the process of new ops enabling. I've triggered on this condition that just a consequence of having two ctors with a different number of arguments. Various ctors are provided for externals op usage for convenience in C++, but for such a standard procedure as cloning, it makes sense to use more low-level approach. Compare with serialization/deserialization: they are implemented automatically using the same low-level apparatus.
private: | ||
void validate_type() const; | ||
|
||
std::string m_destination_type = "f8e4m3"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this string? Should be ov::element::Type. See Convert
class for reference -- no reason to have an alternative way to describe the type.
Sorry, missed this when reviewing because didn't really expect a trap here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requires adding new types here: https://github.com/openvinotoolkit/openvino/blob/master/src/core/include/openvino/core/type/element_type.hpp#L35-L55
* FakeConvert op init * Update dest types names * Update op hpp * Update opset ops number * Init type_prop tests * Add attributes tests * Add op check test * Update namespace in fc cpp * Update getters * Refactor static member * Make destination_type lower case * Update type in test * Move get_valid_types out of class * Update ops number in opset * Remove apply_scale attribute * Additional constructor to make `shift` input optional
* FakeConvert op init * Update dest types names * Update op hpp * Update opset ops number * Init type_prop tests * Add attributes tests * Add op check test * Update namespace in fc cpp * Update getters * Refactor static member * Make destination_type lower case * Update type in test * Move get_valid_types out of class * Update ops number in opset * Remove apply_scale attribute * Additional constructor to make `shift` input optional
Details:
Introduction of
FakeConvert
op shell for FP8 typesBased on: POC by @andreyanufr
The following parts will be introduced as a separate PRs:
Tickets: