[TFLite] Custom attribute reading and While operation support #17932
Conversation
const auto& decoder = get_decoder(node);
int32_t cond_idx = decoder->get_attribute(&tflite::WhileOptions::cond_subgraph_index);
int32_t body_idx = decoder->get_attribute(&tflite::WhileOptions::body_subgraph_index);
The code is quite similar to the TF While translator. Can we create a translate_session class for storing cached, previously converted body graphs? I think it makes sense because then we can use the Frontend class implementation of the convert method, which contains the telemetry logic.
I am leaning towards aligning this code with the TF implementation.
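A rough sketch of the kind of caching translate_session this comment has in mind; the class and member names below are illustrative only, not the actual TF FE API:

#include <functional>
#include <map>
#include <memory>
#include <utility>

#include "openvino/core/model.hpp"

// Illustrative session that caches converted body/condition graphs, so a
// sub-graph referenced by several While operations is converted only once.
class TranslateSessionSketch {
public:
    using SubgraphTranslator = std::function<std::shared_ptr<ov::Model>(int32_t)>;

    explicit TranslateSessionSketch(SubgraphTranslator translator) : m_translator(std::move(translator)) {}

    std::shared_ptr<ov::Model> get_converted_model(int32_t subgraph_idx) {
        auto it = m_cache.find(subgraph_idx);
        if (it != m_cache.end())
            return it->second;                    // reuse a previously converted graph
        auto model = m_translator(subgraph_idx);  // convert on first request only
        m_cache.emplace(subgraph_idx, model);
        return model;
    }

private:
    SubgraphTranslator m_translator;
    std::map<int32_t, std::shared_ptr<ov::Model>> m_cache;
};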
I have considered it. However, upon a closer look into the TF FE translation session I encountered a lot of logic that is TF-specific and not needed from the TFL FE perspective. The remaining code is quite simple. This is why I kept the translation logic in InputModel and overloaded the public frontend::NodeContext API that we have right now. With all that, frontend::tensorflow_lite::NodeContext is able to trigger translation on demand.
With regards to the alignment: it would be great to pull all the similar code into a common space. Perhaps passing the condition / body as ov::Model would be a good start. Nonetheless, there are several places where the TF implementation has shape / type setting for better TensorList and DT_VARIANT support. These parts are not needed for TFL.
I made an attempt to unify this code, but it broke into several pieces, which made it less elegant. So I left them separate for now. I believe we will return to this unification question when unification across all the frontends comes, but for now I don't see the urgency.
Yes, there is some TF-specific stuff in TranslateSession, but you can easily skip it: design a common TranslateSession class and derive a TFL TranslateSession class with your specific methods.
My point is to have a common TranslateSession with a common get_converted_model in the future. For now you can overload this function since we don't have FW node support in TFL FE.
If we decline to use TranslateSession now, we create an additional gap on the way to the common Frontend, where we can place telemetry stuff, for example.
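Building on the cache sketch above, the proposed split into a common base class plus a TFL-specific subclass could look roughly like this (hypothetical names and signatures, not the real TF FE interface):

#include <memory>

#include "openvino/core/model.hpp"

// Hypothetical common base: shared caching, traversal, and telemetry hooks would live here.
class CommonTranslateSession {
public:
    virtual ~CommonTranslateSession() = default;
    // Common entry point; the base would consult its cache and fall back to translate_subgraph().
    virtual std::shared_ptr<ov::Model> get_converted_model(int32_t subgraph_idx) {
        return translate_subgraph(subgraph_idx);
    }

protected:
    virtual std::shared_ptr<ov::Model> translate_subgraph(int32_t) { return nullptr; }
};

// Hypothetical TFL-specific session: overrides only what differs, e.g. the
// behavior when framework (FW) node fallback is not supported in TFL FE.
class TFLiteTranslateSession : public CommonTranslateSession {
protected:
    std::shared_ptr<ov::Model> translate_subgraph(int32_t subgraph_idx) override {
        // TFL-specific translation of the sub-graph would go here.
        return CommonTranslateSession::translate_subgraph(subgraph_idx);
    }
};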
src/frontends/tensorflow_lite/include/openvino/frontend/tensorflow_lite/node_context.hpp
const tflite::Operator* m_node_def;
std::string m_type, m_name;
std::map<size_t, ov::frontend::tensorflow_lite::TensorInfo> m_input_info, m_output_info;
};

class DecoderFlatBufferTensors : public DecoderFlatBuffer {
Please add a comment about why we need this class.
I could handle this in the next PR.
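For reference, a comment along these lines could work; this is inferred from how the class is used later in this conversation (graph input/output tensors get their own decoders) and is an assumption about intent rather than the author's wording:

// Assumed purpose (to be confirmed by the author): unlike DecoderFlatBuffer,
// which decodes tflite::Operator nodes, this decoder wraps a graph
// input/output tensor so that model inputs and outputs can be traversed
// through the same decoder interface as operations.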
auto subgraphs_as_input_models = model_lite->get_subgraphs();
auto input_to_ov_model = [&](const std::shared_ptr<ov::frontend::tensorflow_lite::InputModel>& in_model) {
    auto simple_lambda = [&]() -> std::shared_ptr<ov::Model> {
        std::shared_ptr<ov::Model> model;
        if (in_model)
            translate_graph(in_model, fail_fast, no_conversion, model);
        return model;
    };
    return simple_lambda;
};
std::vector<std::function<std::shared_ptr<ov::Model>()>> submodel_translation_functions;
submodel_translation_functions.reserve(subgraphs_as_input_models.size());
for (const auto& subgraph : subgraphs_as_input_models) {
    submodel_translation_functions.push_back(input_to_ov_model(subgraph));
}
Am I correct that you translate all sub-graphs even if it is not needed? For example, in the case of cutting, some sub-graphs can be redundant.
Is it possible to translate on demand inside the translator for the While operation and cache the result?
It is not a full translation, it is just the creation of InputModel. Here, for each sub-graph, I create translation functions from InputModel to ov::Model which are not triggered here; they are triggered in the While op translator. So a sub-graph only gets translated during the NodeContext::get_subgraph(int) call.
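A condensed sketch of the deferred mechanism described here, with hypothetical member names; the stored std::function is only invoked when a translator asks for the sub-graph:

#include <functional>
#include <memory>
#include <vector>

#include "openvino/core/model.hpp"

// Sketch: the frontend keeps one translation function per sub-graph;
// nothing is translated until get_subgraph() is called for that index.
struct NodeContextSketch {
    std::vector<std::function<std::shared_ptr<ov::Model>()>> submodel_translation_functions;

    std::shared_ptr<ov::Model> get_subgraph(int idx) const {
        // Translation of the requested sub-graph happens only here, on demand.
        return submodel_translation_functions.at(static_cast<size_t>(idx))();
    }
};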
continue;
auto any_item = m_nodes[node_index];
bool is_op = any_item.is<const tflite::Operator*>();
FRONT_END_GENERAL_CHECK(is_op || any_item.is<int32_t>());
Please add a message describing the error.
I will handle it in the next PR.
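For that follow-up, the check could carry a message along these lines (the message text is only a suggestion, assuming the macro accepts streamed message arguments like other OpenVINO checks):

FRONT_END_GENERAL_CHECK(is_op || any_item.is<int32_t>(),
                        "Unexpected node kind at index ",
                        node_index,
                        ": expected a tflite::Operator* or an int32_t tensor index");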
auto node = m_nodes[node_index].as<const tflite::Operator*>();
auto buffers = m_model->buffers();

std::map<size_t, TensorInfo> input_info = {}, output_info = {};
Suggested change:
- std::map<size_t, TensorInfo> input_info = {}, output_info = {};
+ std::unordered_map<size_t, TensorInfo> input_info = {}, output_info = {};
This isn't a change introduced by the current PR: getting a decoder for operations was supported before. So I will change it in the next PR, as the current code doesn't break the behavior.
for (auto input : *node->inputs()) {
    if (input == -1)
        continue;
    auto buffer = (*buffers)[(*tensors)[input]->buffer()];
Very complex construction :)
Can we have a check for each step: 1. input is in tensors; 2. (*tensors)[input]->buffer() is valid; 3. (*tensors)[input]->buffer() is in buffers?
All the asserts are already in place inside flatbuffers.
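If explicit, step-by-step checks were still wanted on top of the flatbuffers asserts, they might look roughly like this (a sketch only; the variable names follow the snippet above and the error messages are suggestions):

FRONT_END_GENERAL_CHECK(tensors != nullptr && static_cast<uint32_t>(input) < tensors->size(),
                        "Operator input index points outside of the tensor table");
const auto* tensor = (*tensors)[input];
FRONT_END_GENERAL_CHECK(tensor != nullptr, "Tensor entry is missing in the model");
const auto buffer_idx = tensor->buffer();
FRONT_END_GENERAL_CHECK(buffers != nullptr && buffer_idx < buffers->size(),
                        "Tensor buffer index points outside of the buffer table");
auto buffer = (*buffers)[buffer_idx];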
} else {
    type = tflite::EnumNamesBuiltinOperator()[operator_code->builtin_code()];
}
if (type == "CUSTOM") {
Please add a comment explaining what the CUSTOM type is.
I find it quite self-explanatory; however, if there is a need to explain it further, I will add a comment in the next PR.
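For context: CUSTOM is the builtin_code the TFLite schema assigns to operations that are not built-in, and the real operation name then comes from OperatorCode::custom_code. A comment in that spirit could read (a sketch, to be adapted to the actual code):

if (type == "CUSTOM") {
    // "CUSTOM" marks a non-builtin TFLite operation; its real name is stored
    // separately in OperatorCode::custom_code rather than in builtin_code.
    type = operator_code->custom_code()->str();
}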
} else {
    type = tflite::EnumNamesBuiltinOperator()[operator_code->builtin_code()];
output_it == outputs->end() ? -1 : static_cast<int64_t>(std::distance(outputs->begin(), output_it));
return std::make_shared<DecoderFlatBufferTensors>(info, input_idx, output_idx);
Shouldn't we check that input_idx / output_idx is not -1?
Here we handle input and/or output tensors, and we explicitly trigger them only for these types of tensors, so having -1 is okay.
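A short clarifying comment at that spot might help future readers; something along these lines (a suggestion only):

// -1 is intentional here, not an error: it simply means the tensor is not
// among the graph outputs (or inputs). Decoders are created explicitly for
// input/output tensors, so the index is valid wherever it is actually used.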
/// \brief Returns an iterator for a subgraph, created on demand
/// If there is no query for a specific sub-graph, its iterator shouldn't be created
/// idx should be in the range 0..get_subgraph_size()-1
std::shared_ptr<GraphIteratorFlatBuffer> get_subgraph(const size_t& idx) const;
Do you have a common cache of already converted sub-graphs, accessible to all sub-graphs simultaneously?
No, I don't. There is no such use case in known TFL models.
size_t subgraph_size = m_graph_iterator->get_subgraph_size();
if (subgraph_size > 1) {
    m_subgraphs.reserve(subgraph_size);
    m_subgraphs.push_back(nullptr);  // no main graph
    for (size_t i = 1; i < subgraph_size; ++i) {
        m_subgraphs.push_back(
            std::make_shared<ov::frontend::tensorflow_lite::InputModel>(m_graph_iterator->get_subgraph(i),
                                                                        m_telemetry));
    }
}
I don't understand. On the one hand, it is said above that a sub-graph is created on demand. On the other hand, you create an InputModel for all sub-graphs.
I described the logic in a previous comment: only the InputModel objects are created here, while the actual translation to ov::Model is deferred.
@@ -24,6 +24,9 @@ class InputModel : public ov::frontend::InputModel {
    std::map<std::string, std::shared_ptr<ov::frontend::tensorflow_lite::TensorLitePlace>> get_tensor_places() const;
    std::map<std::string, Output<Node>> get_tensor_values() const;

    ////// Subgraph Handling /////
strange comment:)
I struggle to understand what action is required here. Could you make your comments more concrete to keep this review productive?
…notoolkit#17932)
* Custom attribute reading and While operation support
* Rearranges FLATBUFFERS_LOCALE_INDEPENDENT setting
* Style
* Make flatbuffers code as version independent as possible
* Comments addressed
Tickets: