Eisw 121295 indices as ids poc backup #24

Closed
Changes from all commits (19 commits):
e1211f3
Renaming the I/O descriptor structure
razvanapetroaie Apr 25, 2024
74871bb
Using indices for serializing the I/O metadata when the compiler vers…
razvanapetroaie May 1, 2024
74a3764
Reintroducing an additional shape attribute inside the "IODescriptor"
razvanapetroaie May 1, 2024
23710dd
Updating the I/O metadata extraction performed inside the plugin-driv…
razvanapetroaie May 2, 2024
ac73acd
Refactoring the "SyncInferRequest" class
razvanapetroaie May 2, 2024
64ba3cd
Refactoring the Level Zero backend
razvanapetroaie May 2, 2024
aa19290
Adding back support for stateful and dynamic models
razvanapetroaie May 7, 2024
d3612aa
Refactoring the "checkLevelZeroAttributesMatch" function
razvanapetroaie May 8, 2024
64fe760
Fixing the accuracy issues
razvanapetroaie May 8, 2024
3860ffc
Removing a couple of unused functions
razvanapetroaie May 8, 2024
59a019c
Adding more comments
razvanapetroaie May 9, 2024
e3157ea
Constraining the input/output entries inside the dummy OV model const…
razvanapetroaie May 12, 2024
81f52d3
Fixing the batching implementation
razvanapetroaie May 13, 2024
db634de
Fixing the "getBatchSize" function
razvanapetroaie May 13, 2024
7997aa0
Removing some unused code passages
razvanapetroaie May 15, 2024
7b0a66a
Restoring the optional "shapeFromIRModel" due to potential driver bug
razvanapetroaie May 20, 2024
2f59b32
Adding an extra log message in the batching implementation
razvanapetroaie May 20, 2024
5a502be
Adding more comments
razvanapetroaie May 20, 2024
4432a71
Adding a test checking whether models using duplicate node names work…
razvanapetroaie May 27, 2024
@@ -0,0 +1,18 @@
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "execution_graph_tests/duplicate_inputs_outputs_names.hpp"

#include "common_test_utils/test_constants.hpp"

using namespace ExecutionGraphTests;

namespace {

INSTANTIATE_TEST_SUITE_P(smoke_duplicateInputsOutputsNames,
                         ExecGraphDuplicateInputsOutputsNames,
                         ::testing::Values(ov::test::utils::DEVICE_CPU),
                         ExecGraphDuplicateInputsOutputsNames::getTestCaseName);

} // namespace
@@ -0,0 +1,16 @@
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "execution_graph_tests/duplicate_inputs_outputs_names.hpp"
#include "common_test_utils/test_constants.hpp"

using namespace ExecutionGraphTests;

namespace {

INSTANTIATE_TEST_SUITE_P(smoke_duplicateInputsOutputsNames, ExecGraphDuplicateInputsOutputsNames,
                         ::testing::Values(ov::test::utils::DEVICE_GPU),
                         ExecGraphDuplicateInputsOutputsNames::getTestCaseName);

} // namespace
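The body of the new test suite is not part of this diff. Going only by the commit message ("Adding a test checking whether models using duplicate node names work"), the scenario it targets is a model whose input and output nodes share a friendly name. The helper below is a hypothetical sketch of such a model, not the actual ExecGraphDuplicateInputsOutputsNames implementation:

#include <memory>

#include "openvino/core/model.hpp"
#include "openvino/op/parameter.hpp"
#include "openvino/op/relu.hpp"
#include "openvino/op/result.hpp"

// Hypothetical helper: builds a model whose input and output nodes deliberately
// carry the same friendly name, which is the situation these tests exercise.
std::shared_ptr<ov::Model> makeDuplicateNamesModel() {
    auto parameter = std::make_shared<ov::op::v0::Parameter>(ov::element::f32, ov::Shape{1, 3});
    parameter->set_friendly_name("duplicate_name");

    auto relu = std::make_shared<ov::op::v0::Relu>(parameter);

    auto result = std::make_shared<ov::op::v0::Result>(relu);
    result->set_friendly_name("duplicate_name");  // intentionally clashes with the parameter

    return std::make_shared<ov::Model>(ov::ResultVector{result}, ov::ParameterVector{parameter});
}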
180 changes: 148 additions & 32 deletions src/plugins/intel_npu/src/al/include/intel_npu/al/icompiler.hpp
@@ -10,6 +10,7 @@
 #include <memory>
 #include <set>
 #include <string>
+#include <string_view>
 #include <unordered_map>
 #include <unordered_set>

@@ -22,48 +23,163 @@
namespace intel_npu {

 /**
- * @brief A helper structure used for storing the metadata found within the I/O nodes.
- * @details The "legacyName" attribute holds the name most commonly used as map key for multiple structures.
- * This value also corresponds to the identifier used by the OpenVINO 1.0 API.
- *
- * "originalShape" corresponds to the shape registered in the graph, while "transposedShape" holds the shape obtained
- * upon applying a transposition corresponding to the legacy layout value. Use the "transposedShape" one if not sure
- * which one you need.
+ * @brief A helper structure used for storing metadata corresponding to one input/output entry.
  */
-struct IONodeDescriptor {
-    std::string legacyName;
-    std::string currentNodeName;
+struct IODescriptor {
+    /**
+     * @brief The name of the input/output assigned by the compiler.
+     * @details This value may differ from other name attributes:
+     * - The compiler could have created additional inputs/outputs (e.g. for representing states). These are not
+     * found in the original IR model.
+     * - The compiler may append indices to names in the case where duplicate names are found.
+     * @note The prefixes introduced by the compiler in order to differentiate the special cases (e.g. states and shape
+     * tensors) were removed prior to initializing this field.
+     */
+    std::string nameFromCompiler;
+
+    ov::element::Type precision;
+
+    ov::PartialShape shapeFromCompiler;
+
+    /**
+     * @brief If set to "true", the current object describes a buffer which may be used for altering a state tensor.
+     * @details This flag is set if the compiler prefixed the name using a "read value" prefix. The state input and
+     * state output descriptors are also tied using the "relatedDescriptorIndex" attribute.
+     */
+    bool isStateInput = false;
+
+    /**
+     * @brief If set to "true", the current object describes a buffer which reflects the value of a state tensor.
+     * @details This flag is set if the compiler prefixed the name using an "assign" prefix. The state input and
+     * state output descriptors are also tied using the "relatedDescriptorIndex" attribute.
+     */
+    bool isStateOutput = false;
+
+    /**
+     * @brief If set to "true", the buffer of the tensor described here holds as its value the shape of the referenced
+     * tensor.
+     * @details This flag is set if the compiler prefixed the name using a "shape" prefix.
+     *
+     * The referenced tensor bears the same name ("nameFromCompiler"), but its "isShapeTensor" value is set to
+     * "false". The two descriptors are also tied using the "relatedDescriptorIndex" attribute.
+     */
+    bool isShapeTensor = false;
+
+    /**
+     * @brief Points towards a related descriptor.
+     * @details The related descriptors are defined by (state input, state output) or (dynamic tensor, shape tensor)
+     * pairs.
+     */
+    std::optional<size_t> relatedDescriptorIndex;
+
+    /**
+     * @brief The friendly name of the node extracted from the IR model.
+     * @details In some cases, this field is required for constructing a dummy model which uses the same input/output
+     * metadata as the original IR model.
+     *
+     * This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added by the
+     * compiler).
+     */
+    std::string nodeFriendlyName;
+
+    /**
+     * @brief The names of the output tensors extracted from the IR model.
+     * @details In some cases, this field is required for constructing a dummy model which uses the same input/output
+     * metadata as the original IR model.
+     *
+     * This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added by the
+     * compiler).
+     */
     std::unordered_set<std::string> outputTensorNames;
-    ov::element::Type_t precision;
-    ov::PartialShape originalShape;
-    ov::PartialShape transposedShape;
-};

-/**
- * @brief A helper map to represent descriptions for inputs and outputs
- * of a network
- */
-using IONodeDescriptorMap = std::unordered_map<std::string, IONodeDescriptor>;
+    /**
+     * @brief The shape extracted from the IR model.
+     * @details The values may differ from the ones found in "shapeFromCompiler" if batching is to be handled by the
+     * plugin.
+     *
+     * This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added
+     * by the compiler).
+     */
+    std::optional<ov::PartialShape> shapeFromIRModel = std::nullopt;
+};
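To ground the flags documented above, here is a minimal sketch of the descriptor pair produced for one dynamic tensor and its companion shape tensor. The name, shapes and precisions are hypothetical, not taken from the PR:

// Hypothetical data tensor with a dynamic first axis, present in the IR model.
intel_npu::IODescriptor dataOutput;
dataOutput.nameFromCompiler = "boxes";
dataOutput.precision = ov::element::f32;
dataOutput.shapeFromCompiler = ov::PartialShape{-1, 4};
dataOutput.shapeFromIRModel = ov::PartialShape{-1, 4};   // set: the entry exists in the IR model

// Companion shape tensor: same compiler name (the "shape" prefix would already be
// stripped), "isShapeTensor" raised, and a buffer holding one value per axis of "boxes".
intel_npu::IODescriptor shapeOutput;
shapeOutput.nameFromCompiler = "boxes";
shapeOutput.precision = ov::element::i32;                // assumed precision for shape values
shapeOutput.shapeFromCompiler = ov::PartialShape{2};     // rank of the referenced tensor
shapeOutput.isShapeTensor = true;
// "relatedDescriptorIndex" stays empty until NetworkMetadata::bindRelatedDescriptors()
// (shown below) ties the pair together.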

 struct NetworkMetadata final {
     std::string name;

-    std::vector<std::string> inputNames;
-    std::vector<std::string> outputNames;
-    std::vector<std::string> stateNames;
-    std::vector<std::string> shapeNames;
+    std::vector<IODescriptor> inputs;
+    std::vector<IODescriptor> outputs;
+    std::vector<IODescriptor> profilingOutputs;

-    IONodeDescriptorMap parameters;
-    IONodeDescriptorMap results;
-    IONodeDescriptorMap states;
-    IONodeDescriptorMap shapes;
-    IONodeDescriptorMap profilingOutputs;
+    size_t numStreams = 1;

-    std::unordered_map<std::string, size_t> inputOrder;
-    std::unordered_map<std::string, size_t> outputOrder;
-
-    int numStreams = 1;
-};
+    std::optional<size_t> findByName(const std::vector<IODescriptor>& descriptors, const std::string_view targetName) {
+        for (size_t descriptorIndex = 0; descriptorIndex < descriptors.size(); ++descriptorIndex) {
+            if (descriptors.at(descriptorIndex).nameFromCompiler == targetName) {
+                return std::optional(descriptorIndex);
+            }
+        }
+
+        return std::nullopt;
+    }
+
+    /**
+     * @brief Binds the (state input, state output) and (dynamic tensor, shape tensor) pairs using the
+     * "relatedDescriptorIndex" attribute.
+     * @details For state inputs, the "relatedDescriptorIndex" value is set to the index of the output which bears the
+     * same name. The reverse is also applied.
+     *
+     * For shape tensors, the lookup is performed in the same container (inputs or outputs). The value is once again
+     * set to the index of the entry which bears the same name.
+     */
+    void bindRelatedDescriptors() {
+        size_t ioIndex = 0;
+
+        for (IODescriptor& input : inputs) {
+            if (input.relatedDescriptorIndex.has_value()) {
+                ++ioIndex;
+                continue;
+            }
+
+            if (input.isStateInput) {
+                const std::optional<size_t> relatedDescriptorIndex = findByName(outputs, input.nameFromCompiler);
+
+                if (relatedDescriptorIndex.has_value()) {
+                    input.relatedDescriptorIndex = relatedDescriptorIndex;
+                    outputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
+                }
+            } else if (input.isShapeTensor) {
+                const std::optional<size_t> relatedDescriptorIndex = findByName(inputs, input.nameFromCompiler);
+
+                if (relatedDescriptorIndex.has_value() && *relatedDescriptorIndex != ioIndex) {
+                    input.relatedDescriptorIndex = relatedDescriptorIndex;
+                    inputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
+                }
+            }
+
+            ++ioIndex;
+        }
+
+        ioIndex = 0;
+
+        for (IODescriptor& output : outputs) {
+            if (output.relatedDescriptorIndex.has_value()) {
+                ++ioIndex;
+                continue;
+            }
+
+            if (output.isShapeTensor) {
+                const std::optional<size_t> relatedDescriptorIndex = findByName(outputs, output.nameFromCompiler);
+
+                if (relatedDescriptorIndex.has_value() && *relatedDescriptorIndex != ioIndex) {
+                    output.relatedDescriptorIndex = relatedDescriptorIndex;
+                    outputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
+                }
+            }
+
+            ++ioIndex;
+        }
+    }
+};
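Continuing the hypothetical "boxes" sketch from above, this is how the binding plays out once both descriptors are registered; it also illustrates an ad-hoc findByName lookup. A usage sketch, not code from the PR:

intel_npu::NetworkMetadata metadata;
metadata.outputs.push_back(dataOutput);   // index 0: the dynamic data tensor
metadata.outputs.push_back(shapeOutput);  // index 1: its shape tensor

metadata.bindRelatedDescriptors();

// Afterwards the pair is tied in both directions:
//   metadata.outputs.at(1).relatedDescriptorIndex == 0   (shape tensor -> data tensor)
//   metadata.outputs.at(0).relatedDescriptorIndex == 1   (data tensor -> shape tensor)

// An ad-hoc lookup by compiler name returns the index of the first entry bearing
// that name (here, the data tensor at index 0).
const std::optional<size_t> boxesIndex = metadata.findByName(metadata.outputs, "boxes");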

/**
* @struct NetworkDescription
107 changes: 40 additions & 67 deletions src/plugins/intel_npu/src/al/include/sync_infer_request.hpp
@@ -92,56 +92,32 @@ class SyncInferRequest : public ov::IInferRequest {
*/
void initialize_states();

-protected:
-    /**
-     * @return The state tensors accessible by their names.
-     */
-    std::unordered_map<std::string, std::shared_ptr<VariableState>>& get_variable_states() {
-        return _variableStates;
-    }
-
-    /**
-     * @return The names used by the inputs in the order registered inside the model.
-     */
-    std::vector<std::string> get_input_names() {
-        return _metadata.inputNames;
-    }
-
-    /**
-     * @return The names used by the outputs in the order registered inside the model.
-     */
-    std::vector<std::string> get_output_names() {
-        return _metadata.outputNames;
-    }
-
-    /**
-     * @return The names used by the state variables in the order registered inside the model.
-     */
-    std::vector<std::string> get_state_names() {
-        return _metadata.stateNames;
-    }
-
-    /**
-     * @return The names used by the shape variables in the order registered inside the model.
-     */
-    std::vector<std::string> get_shape_names() {
-        return _metadata.shapeNames;
-    }
-
-    /**
-     * @return A map holding references towards all tensors used by the current inference request object.
-     */
-    std::unordered_map<std::string, std::shared_ptr<ov::ITensor>>& get_all_tensors() {
-        return _allTensors;
-    }
-
-    /**
-     * @return A map holding references towards all shapes tensors used by the current inference request object.
-     */
-    std::unordered_map<std::string, std::shared_ptr<ov::ITensor>>& get_shapes_tensors() {
-        return _shapesTensors;
-    }
+    /**
+     * @see ov::ISyncInferRequest
+     */
+    struct FoundPort {
+        size_t idx;
+        enum class Type { NOT_FOUND = 0, INPUT, OUTPUT } type;
+
+        bool found() {
+            return type != Type::NOT_FOUND;
+        }
+        bool is_input() {
+            return type == Type::INPUT;
+        }
+        bool is_output() {
+            return !is_input();
+        }
+    };
+
+    /**
+     * @brief Finds an input or output port.
+     * @return A structure containing the index of the input/output, or reporting that the port wasn't found.
+     * @see ov::ISyncInferRequest
+     */
+    FoundPort find_port(const ov::Output<const ov::Node>& port) const;
+
+protected:
/**
* @brief Basic checks for input/output tensor
*
@@ -163,45 +139,42 @@
virtual void check_network_precision(const ov::element::Type_t precision) = 0;

-    /**
-     * @brief Indicates a kind of provided tensor. Marks special tensors, used for internal implementation
-     */
-    enum class TensorType { InputOrOutput, Shape, State };
-
     /**
-     * @brief Allocates a tensor on host and stores the reference inside the "_allTensors" attribute. If a buffer
-     * address is provided, then the tensor is built upon it and no additional data buffer is allocated.
-     * @param tensorName The name by which the tensor shall be identified
+     * @brief Allocates a tensor on host and stores the reference inside multiple attributes.
      * @param descriptor Tensor's metadata
-     * @param isState If true, the tensor shall also be stored inside the state variables map. In this case, adding the
-     * tensor to this structure would be required in order to correctly answer the state queries.
+     * @param isInput Determines the containers in which the newly allocated tensors will be stored.
      * @param allocator If provided, the tensor uses the custom allocator instead of using the default one.
+     * @param batchSize If provided, the value of the shape on the 0th axis is overridden with this value.
      */
-    void allocate_tensor(std::string tensorName,
-                         const IONodeDescriptor& descriptor,
-                         TensorType tensorType = TensorType::InputOrOutput,
-                         const ov::Allocator& allocator = {});
+    void allocate_tensor(const IODescriptor& descriptor,
+                         const bool isInput,
+                         const ov::Allocator& allocator = {},
+                         const std::optional<std::size_t> batchSize = std::nullopt);
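The batchSize parameter is worth a concrete illustration. The sketch below shows what the 0th-axis override inside allocate_tensor() could look like; it is assumed from the parameter documentation above, not copied from the PR's .cpp file, and the variable name "allocatedShape" is hypothetical:

// Start from the compiler-reported shape and materialize its upper bounds.
ov::Shape allocatedShape = descriptor.shapeFromCompiler.get_max_shape();

if (batchSize.has_value()) {
    // Plugin-side batching: the compiler was handed a batch-1 model, so the
    // allocation restores the requested batch size on the 0th axis.
    allocatedShape[0] = *batchSize;
}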

+    std::vector<std::shared_ptr<ov::ITensor>> _inputTensors;
+    std::vector<std::shared_ptr<ov::ITensor>> _outputTensors;

-    // Mutable to return reference to ov::Tensor
-    mutable std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _allTensors;
-    mutable std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _shapesTensors;
     // A copy of each tensor is needed to maintain the original L0 memory allocation in case the user provides another
     // memory area for the tensor.
-    std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _copyAllTensors;
+    std::vector<std::shared_ptr<ov::ITensor>> _copyInputTensors;
+    std::vector<std::shared_ptr<ov::ITensor>> _copyOutputTensors;

-    std::unordered_map<std::string, std::shared_ptr<VariableState>> _variableStates;
+    std::vector<ov::SoPtr<ov::IVariableState>> _variableStates;

     // This is an intel_npu::ICompiledModel pointer, but we need to use the OV base class because
     // ov::IInferRequest::get_compiled_model returns a reference to shared_ptr!
     std::shared_ptr<const ov::ICompiledModel> _compiledModel;

     NetworkMetadata _metadata;

-    // Stored in order to avoid additional processing when launching inferences
-    std::vector<std::string> _inputAndStateInputNames;
-    std::vector<std::string> _outputAndStateOutputNames;
+    /**
+     * @see ov::ISyncInferRequest
+     */
+    mutable std::unordered_map<size_t, FoundPort> _cachedPorts;

-    std::unordered_map<std::string, std::string> _nodeNameToLegacyName;
-    std::unordered_map<std::string, std::string> _legacyNameToNodeName;
+    /**
+     * @see ov::ISyncInferRequest
+     */
+    mutable std::mutex _cacheMutex;
};

} // namespace intel_npu
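The corresponding sync_infer_request.cpp is not shown in this diff. As a rough sketch of how find_port() could combine the lookup with the new _cachedPorts/_cacheMutex members (assumptions: the inherited get_inputs()/get_outputs() accessors of ov::IInferRequest are usable here, and a simple hand-rolled hash suffices), not the PR's actual implementation:

SyncInferRequest::FoundPort SyncInferRequest::find_port(const ov::Output<const ov::Node>& port) const {
    // Hash the producing node and the output index so repeated queries for the
    // same port can be served from the cache instead of rescanning all ports.
    const size_t portHash =
        std::hash<const ov::Node*>()(port.get_node()) ^ (std::hash<size_t>()(port.get_index()) << 1);

    {
        std::lock_guard<std::mutex> lock(_cacheMutex);
        const auto cacheIterator = _cachedPorts.find(portHash);
        if (cacheIterator != _cachedPorts.end()) {
            return cacheIterator->second;
        }
    }

    FoundPort foundPort{0, FoundPort::Type::NOT_FOUND};

    const auto& inputs = get_inputs();  // inherited accessor, assumed available here
    for (size_t portIndex = 0; portIndex < inputs.size(); ++portIndex) {
        if (inputs[portIndex] == port) {
            foundPort = {portIndex, FoundPort::Type::INPUT};
            break;
        }
    }

    if (!foundPort.found()) {
        const auto& outputs = get_outputs();
        for (size_t portIndex = 0; portIndex < outputs.size(); ++portIndex) {
            if (outputs[portIndex] == port) {
                foundPort = {portIndex, FoundPort::Type::OUTPUT};
                break;
            }
        }
    }

    // Remember the result so subsequent calls for the same port stay cheap.
    std::lock_guard<std::mutex> lock(_cacheMutex);
    _cachedPorts[portHash] = foundPort;
    return foundPort;
}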