Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Opset13][FP8] Introduce FakeConvert op core #20930

Merged

Conversation

mitruska
Copy link
Contributor

@mitruska mitruska commented Nov 7, 2023

Details:

  • Introduction of FakeConvert op shell for FP8 types

  • Based on: POC by @andreyanufr


The following parts will be introduced as a separate PRs:

  • support for evaluate method and reference implementation
  • shape inference and input validation improvements
  • MO IR Reader
  • Python API
  • op specification

Tickets:

  • 19976

@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: IE Tests OpenVINO Test: plugins and common category: CPP API OpenVINO CPP API bindings labels Nov 7, 2023
Comment on lines 18 to 22
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3",
bool apply_scale = true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andreyanufr @AlexKoff88, I wonder can we drop the apply_scale attribute in favour of having two constructors and then apply_scale if the scale and shift are provided, in other words if (inputs.size() == 3). Do you see a need to keep this attribute and switch off the scaling even if the inputs are provided?

Suggested change
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3",
bool apply_scale = true);
FakeConvert(const ov::Output<ov::Node>& arg,
const std::string& destination_type = "F8E4M3");
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply_scale was added for research purposes only. This should not be in the current implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreyanufr, do we really need apply_scale at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexKoff88 , we don't need it. I think we will put scale and shift outside of this operation in the network graph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed it's better to keep scaling within the FakeConvert logic, in the updated version scale input is required and shift input optional.
The apply_scale attribute has been removed.


private:
void validate() const;
std::string m_destination_type = "F8E4M3";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the FP8 types will be a part of the ov::element, this destination type should be ov::element, for now it can be string as proposed in the POC (or improved to be struct/enum).

src/core/src/op/fake_convert.cpp Outdated Show resolved Hide resolved
@mitruska mitruska marked this pull request as ready for review November 7, 2023 15:36
@mitruska mitruska requested review from a team as code owners November 7, 2023 15:36
@AlexKoff88 AlexKoff88 requested a review from slyalin November 7, 2023 15:49
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change F8E4M3 to f8e4m3 per our agreement? Is it consistent with the names of other data types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update it to lowercase, in case of keeping it as string, I thought that both options should be allowed (upper and lower case) and unified with std::tolower. We can stick with lower case only.
Currently for ov::element upper and lower case are acceptable:

if (type == "f16" || type == "FP16") {
return ::ov::element::Type(::ov::element::Type_t::f16);
} else if (type == "f32" || type == "FP32") {

Is it okay to keep the destination_type as string (at least for now, and update it to ov::element later) or it's better to introduce temporary enum representation for FP8 types attribute?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is ok.

@mitruska mitruska self-assigned this Nov 8, 2023
@mitruska mitruska added this to the 2023.3 milestone Nov 8, 2023
@mitruska mitruska added the category: Opset OpenVINO Opset label Nov 8, 2023
Comment on lines 18 to 22
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3",
bool apply_scale = true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply_scale was added for research purposes only. This should not be in the current implementation.

Comment on lines 18 to 22
FakeConvert(const ov::Output<ov::Node>& arg,
const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
const std::string& destination_type = "F8E4M3",
bool apply_scale = true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexKoff88 , we don't need it. I think we will put scale and shift outside of this operation in the network graph.

@AlexKoff88
Copy link
Contributor

@AlexKoff88 , we don't need it. I think we will put scale and shift outside of this operation in the network graph.

I don't think we should keep scale and shift outside of operation. Otherwise, it is gonna be long subgraphs again, such as Multiply(s)->Subtract(zp)->FakeConvert->Add(zp)->Multiply(1/s) for each operation and its weights. This will blow off the model significantly and lead to much larger representation, longer transformations, more complicated model processing, etc. It can hurt LLMs.


std::shared_ptr<ov::Node> FakeConvert::clone_with_new_inputs(const ov::OutputVector& new_args) const {
OV_OP_SCOPE(v13_FakeConvert_clone_with_new_inputs);
OPENVINO_ASSERT(new_args.size() == 3, "Incorrect number of new arguments");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As scale and shift are not mandatory, please make them optional. Otherwise I don't understand why we have apply_scale, but don't have apply_shift.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make both scale and shift optional, you really need to have only apply_scale among attributes. Otherwise apply_shift is required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the shift an optional parameter. I would make scale as required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we always need to apply scale, but shift can be optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, in the updated version scale input is required and shift input optional.
The apply_scale attribute has been removed.

const ov::Output<ov::Node>& scale,
const ov::Output<ov::Node>& shift,
std::string destination_type = "f8e4m3",
bool apply_scale = false);
Copy link
Contributor

@slyalin slyalin Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to have apply_scale set to false by default, suggesting to set it automatically in case if scale input is provided (and sale and shift inputs are made optional as suggested in another comment). If one wants to provide shift but doesn't apply scale, then as we don't have "gaps" in the list of operation arguments, one should provide arbitrary scale (will be ignored), then shift and set apply_scale to false. This is only (and looks like very rare situation) when apply_scale should have manual control as a parameter of constructor.

Copy link
Contributor

@slyalin slyalin Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FakeConvert(input) ==> apply_scale = false (as no scale provided)
FakeConvert(input, scale) ==> apply_scale = true
FakeConvert(input, scale, shift) ==> apply_scale = true
FakeConvert(input, scale, shift, "f8e4m", false) ==> apply_scale = false, scale input is ignored

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I started the conversation above: #20930 (comment)
So the final suggestion is to have three constructors and the apply_scale attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, in the updated version scale input is required and shift input optional.
The apply_scale attribute has been removed.

Copy link
Contributor

@slyalin slyalin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: change scale and shift to optional, rework ctor signature to be more practical.

@mitruska
Copy link
Contributor Author

@slyalin, @AlexKoff88, @andreyanufr As discussed offline, removed apply_scale attribute, keeping scale as required input and shift as optional.
Please re-review.

Comment on lines +49 to +51
return std::make_shared<FakeConvert>(new_args.at(0), new_args.at(1), m_destination_type);
} else if (new_args.size() == 3) {
return std::make_shared<FakeConvert>(new_args.at(0), new_args.at(1), new_args.at(2), m_destination_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mitruska, let me put it here but it is not a blocker for this PR, just a general comment for Op enabling that can be made simpler: clone_with_new_inputs can be generated automatically with using default ctor that is available for all operations, and the visitors. It is applicable for all the operations and utilizing such a mechanism will simplify the process of new ops enabling. I've triggered on this condition that just a consequence of having two ctors with a different number of arguments. Various ctors are provided for externals op usage for convenience in C++, but for such a standard procedure as cloning, it makes sense to use more low-level approach. Compare with serialization/deserialization: they are implemented automatically using the same low-level apparatus.

@mitruska mitruska enabled auto-merge (squash) November 14, 2023 19:54
@mitruska mitruska merged commit afc5995 into openvinotoolkit:master Nov 14, 2023
35 checks passed
private:
void validate_type() const;

std::string m_destination_type = "f8e4m3";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this string? Should be ov::element::Type. See Convert class for reference -- no reason to have an alternative way to describe the type.
Sorry, missed this when reviewing because didn't really expect a trap here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byungilm pushed a commit to byungilm/openvino that referenced this pull request Nov 17, 2023
* FakeConvert op init

* Update dest types names

* Update op hpp

* Update opset ops number

* Init type_prop tests

* Add attributes tests

* Add op check test

* Update namespace in fc cpp

* Update getters

* Refactor static member

* Make destination_type lower case

* Update type in test

* Move get_valid_types out of class

* Update ops number in opset

* Remove apply_scale attribute

* Additional constructor to make `shift` input optional
allnes pushed a commit to allnes/openvino that referenced this pull request Nov 23, 2023
* FakeConvert op init

* Update dest types names

* Update op hpp

* Update opset ops number

* Init type_prop tests

* Add attributes tests

* Add op check test

* Update namespace in fc cpp

* Update getters

* Refactor static member

* Make destination_type lower case

* Update type in test

* Move get_valid_types out of class

* Update ops number in opset

* Remove apply_scale attribute

* Additional constructor to make `shift` input optional
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Core OpenVINO Core (aka ngraph) category: CPP API OpenVINO CPP API bindings category: IE Tests OpenVINO Test: plugins and common category: Opset OpenVINO Opset
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants