[NPU] Switching the I/O identification convention to indices #24248

Merged
Changes from 32 commits (46 commits total):
4bef825
Renaming the I/O descriptor structure
razvanapetroaie Apr 25, 2024
9321da5
Using indices for serializing the I/O metadata when the compiler vers…
razvanapetroaie May 1, 2024
64fa079
Reintroducing an additional shape attribute inside the "IODescriptor"
razvanapetroaie May 1, 2024
01a36d7
Updating the I/O metadata extraction performed inside the plugin-driv…
razvanapetroaie May 2, 2024
f6ead2c
Refactoring the "SyncInferRequest" class
razvanapetroaie May 2, 2024
4ec8608
Refactoring the Level Zero backend
razvanapetroaie May 2, 2024
6184d77
Adding back support for stateful and dynamic models
razvanapetroaie May 7, 2024
ed66429
Refactoring the "checkLevelZeroAttributesMatch" function
razvanapetroaie May 8, 2024
7b4064e
Fixing the accuracy issues
razvanapetroaie May 8, 2024
008f6bf
Removing a couple of unused functions
razvanapetroaie May 8, 2024
0c6e4b9
Adding more comments
razvanapetroaie May 9, 2024
64095cc
Constraining the input/output entries inside the dummy OV model const…
razvanapetroaie May 12, 2024
518d2fe
Fixing the batching implementation
razvanapetroaie May 13, 2024
0171e44
Fixing the "getBatchSize" function
razvanapetroaie May 13, 2024
01aa897
Removing some unused code passages
razvanapetroaie May 15, 2024
b470022
Restoring the optional "shapeFromIRModel" due to potential driver bug
razvanapetroaie May 20, 2024
c6dcdd2
Adding an extra log message in the batching implementation
razvanapetroaie May 20, 2024
1f77085
Adding more comments
razvanapetroaie May 20, 2024
49736c5
Adding a test checking whether models using duplicate node names work…
razvanapetroaie May 27, 2024
cc738a9
Solving clang formatter errors
razvanapetroaie May 28, 2024
ae37396
Moving the NPU test instantiaion to the OV repository
razvanapetroaie May 29, 2024
501b6d1
Renaming the tensor attributes
razvanapetroaie May 29, 2024
a8546f8
Removing the test instantiations for the template and GPU plugins
razvanapetroaie May 29, 2024
a6161ae
Creating the "icompiler.cpp" file
razvanapetroaie Jun 4, 2024
3fb4e45
Updating the compiler version
razvanapetroaie Jun 17, 2024
dfa2cc7
Solving the conflicts caused by the PR introducing the remote tensors
razvanapetroaie Jul 17, 2024
a0056a6
Making this thing compilable again
razvanapetroaie Jul 17, 2024
e717416
Batch size -> number of command lists inside the L0 pipeline
razvanapetroaie Jul 17, 2024
60f4fef
Solving more conflicts
razvanapetroaie Jul 22, 2024
593afa0
Adjusting the tensor names check inside the inference request class
razvanapetroaie Jul 22, 2024
925809b
Solving more conflicts, the last ones hopefully
razvanapetroaie Jul 22, 2024
8f3251b
Updating the compiler version used
razvanapetroaie Jul 22, 2024
e5960a1
Merge remote-tracking branch 'upstream/master' into EISW-121295-indic…
razvanapetroaie Jul 24, 2024
3b8d69b
Merge remote-tracking branch 'upstream/master' into EISW-121295-indic…
razvanapetroaie Jul 29, 2024
92e79eb
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Jul 30, 2024
d9334df
Replacing NetworkMetadata::findByName with lambda functions
razvanapetroaie Jul 31, 2024
a2998f4
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Jul 31, 2024
3d0a194
Updating the compiler version once more
razvanapetroaie Aug 1, 2024
c110096
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 1, 2024
4b0246d
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 5, 2024
c2f47b5
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 6, 2024
8543438
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 6, 2024
860552c
Merge remote-tracking branch 'upstream/master' into EISW-121295-indic…
razvanapetroaie Aug 7, 2024
7c106e3
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 9, 2024
4853fd7
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 11, 2024
d3e1657
Merge branch 'master' into EISW-121295-indices-as-ids-poc
razvanapetroaie Aug 12, 2024
@@ -0,0 +1,18 @@
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "execution_graph_tests/duplicate_inputs_outputs_names.hpp"

#include "common_test_utils/test_constants.hpp"

using namespace ExecutionGraphTests;

namespace {

INSTANTIATE_TEST_SUITE_P(smoke_duplicateInputsOutputsNames,
ExecGraphDuplicateInputsOutputsNames,
::testing::Values(ov::test::utils::DEVICE_CPU),
ExecGraphDuplicateInputsOutputsNames::getTestCaseName);

} // namespace
127 changes: 95 additions & 32 deletions src/plugins/intel_npu/src/al/include/intel_npu/al/icompiler.hpp
@@ -10,6 +10,7 @@
#include <memory>
#include <set>
#include <string>
#include <string_view>
#include <unordered_map>
#include <unordered_set>

@@ -22,48 +23,110 @@
namespace intel_npu {

/**
* @brief A helper structure used for storing the metadata found within the I/O nodes.
* @details The "legacyName" attribute holds the name most commonly used as map key for multiple structures.
* This value also corresponds to the identifier used by the OpenVINO 1.0 API.
*
* "originalShape" corresponds to the shape registered in the graph, while "transposedShape" holds the shape obtained
* upon applying a transposition corresponding to the legacy layout value. Use the "transposedShape" one if not sure
* which one you need.
* @brief A helper structure used for storing metadata corresponding to one input/output entry.
*/
struct IONodeDescriptor {
std::string legacyName;
std::string currentNodeName;
struct IODescriptor {
[Review comment — razvanapetroaie (Contributor, author), May 28, 2024: "Most of the changes snowballed from here."]
/**
* @brief The name of the input/output assigned by the compiler.
* @details This value may differ from other name attributes:
* - The compiler could have created additional inputs/outputs (e.g. for representing states). These are not
* found in the original IR model.
* - The compiler may append indices to names in the case where duplicate names are found.
* @note The prefixes introduced by the compiler in order to differentiate the special cases (e.g. states and shape
* tensors) were removed prior to initializing this field.
*/
std::string nameFromCompiler;

ov::element::Type precision;

ov::PartialShape shapeFromCompiler;

/**
* @brief If set to "true", the current object describes a buffer which may be used for altering a state tensor.
* @details This flag is set if the compiler prefixed the name using a "read value" prefix. The state input and
* state output descriptors are also tied using the "relatedDescriptorIndex" attribute.
*/
bool isStateInput = false;

/**
* @brief If set to "true", the current object describes a buffer which reflects the value of a state tensor.
* @details This flag is set if the compiler prefixed the name using an "assign" prefix. The state input and
* state output descriptors are also tied using the "relatedDescriptorIndex" attribute.
*/
bool isStateOutput = false;

/**
* @brief If set to "true", the buffer of the tensor described here contains as value the shape of the referenced
* tensor.
* @details This flag is set if the compiler prefixed the name using a "shape" prefix.
*
* The referenced tensor bears the same name ("nameFromCompiler"), but its "isShapeTensor" value is set to
* "false". The two descriptors are also tied using the "relatedDescriptorIndex" attribute.
*/
bool isShapeTensor = false;

/**
* @brief Points towards a related descriptor.
* @details The related descriptors are defined by (state input, state output) or (dynamic tensor, shape tensor)
* pairs.
*/
std::optional<size_t> relatedDescriptorIndex;

/**
* @brief The friendly name of the node extracted from the IR model.
* @details In some cases, this field is required for constructing a dummy model which uses the same input/output
* metadata as the original IR model.
*
* This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added by the
* compiler).
*/
std::string nodeFriendlyName;

/**
* @brief The names of the output tensors extracted from the IR model.
* @details In some cases, this field is required for constructing a dummy model which uses the same input/output
* metadata as the original IR model.
*
* This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added by the
* compiler).
*/
std::unordered_set<std::string> outputTensorNames;
ov::element::Type_t precision;
ov::PartialShape originalShape;
ov::PartialShape transposedShape;
};

/**
* @brief A helper map to represent descriptions for inputs and outputs
* of a network
*/
using IONodeDescriptorMap = std::unordered_map<std::string, IONodeDescriptor>;
/**
* @brief The shape extracted from the IR model.
* @details The values may differ from the ones found in "shapeFromCompiler" if batching is to be handled by the
* plugin.
*
* This field may be empty if the I/O entry is not found in the original IR model (i.e. the entry was added
* by the compiler).
*/
std::optional<ov::PartialShape> shapeFromIRModel = std::nullopt;
};

struct NetworkMetadata final {
std::string name;

std::vector<std::string> inputNames;
std::vector<std::string> outputNames;
std::vector<std::string> stateNames;
std::vector<std::string> shapeNames;
std::vector<IODescriptor> inputs;
std::vector<IODescriptor> outputs;
std::vector<IODescriptor> profilingOutputs;

IONodeDescriptorMap parameters;
IONodeDescriptorMap results;
IONodeDescriptorMap states;
IONodeDescriptorMap shapes;
IONodeDescriptorMap profilingOutputs;
size_t numStreams = 1;

std::unordered_map<std::string, size_t> inputOrder;
std::unordered_map<std::string, size_t> outputOrder;
/**
* @brief Binds the (state input, state output) and (dynamic tensor, shape tensor) pairs using the
* "relatedDescriptorIndex" attribute.
* @details For state inputs, the "relatedDescriptorIndex" value is set to the index of the output which bears the
* same name. The reverse is also applied.
*
* For shape tensors, the lookup is performed in the same container (inputs or outputs). The value is once again set
* to the index of the entry which bears the same name.
*/
void bindRelatedDescriptors();

int numStreams = 1;
};
private:
std::optional<size_t> findByName(const std::vector<IODescriptor>& descriptors, const std::string_view targetName);

}; // namespace intel_npu

/**
* @struct NetworkDescription
113 changes: 42 additions & 71 deletions src/plugins/intel_npu/src/al/include/sync_infer_request.hpp
@@ -92,56 +92,32 @@ class SyncInferRequest : public ov::IInferRequest {
*/
void initialize_states();

protected:
/**
* @return The state tensors accessible by their names.
*/
std::unordered_map<std::string, std::shared_ptr<VariableState>>& get_variable_states() {
return _variableStates;
}

/**
* @return The names used by the inputs in the order registered inside the model.
*/
std::vector<std::string> get_input_names() {
return _metadata.inputNames;
}

/**
* @return The names used by the outputs in the order registered inside the model.
*/
std::vector<std::string> get_output_names() {
return _metadata.outputNames;
}

/**
* @return The names used by the state variables in the order registered inside the model.
* @see ov::ISyncInferRequest
*/
std::vector<std::string> get_state_names() {
return _metadata.stateNames;
}
struct FoundPort {
size_t idx;
enum class Type { NOT_FOUND = 0, INPUT, OUTPUT } type;

/**
* @return The names used by the shape variables in the order registered inside the model.
*/
std::vector<std::string> get_shape_names() {
return _metadata.shapeNames;
}
bool found() {
return type != Type::NOT_FOUND;
}
bool is_input() {
return type == Type::INPUT;
}
bool is_output() {
return !is_input();
}
};

/**
* @return A map holding references towards all tensors used by the current inference request object.
 * @brief Finds the input or output port matching the given node.
 * @return A structure containing the index of the input/output port, or one reporting that the port was not found.
* @see ov::ISyncInferRequest
*/
std::unordered_map<std::string, std::shared_ptr<ov::ITensor>>& get_all_tensors() {
return _allTensors;
}
FoundPort find_port(const ov::Output<const ov::Node>& port) const;
[Review comment — lmielick (Contributor), Jun 10, 2024: "Do we expect this to be used in IMD? If not let's consider moving into ZeroInferRequest"]
[Reply — razvanapetroaie (author): "Yeah, it is essential for IMD too. The method is used inside the get_tensor and set_tensor functions which are the standard means of interacting with the inputs/outputs used by the backend."]


/**
* @return A map holding references towards all shapes tensors used by the current inference request object.
*/
std::unordered_map<std::string, std::shared_ptr<ov::ITensor>>& get_shapes_tensors() {
return _shapesTensors;
}

protected:
/**
* @brief Basic checks for input/output tensor
*
@@ -163,45 +139,40 @@
virtual void check_network_precision(const ov::element::Type_t precision) const = 0;

/**
* @brief Indicates a kind of provided tensor. Marks special tensors, used for internal implementation
*/
enum class TensorType { InputOrOutput, Shape, State };

/**
* @brief Allocates a tensor on host and stores the reference inside the "_allTensors" attribute. If a buffer
* address is provided, then the tensor is built upon it and no additional data buffer is allocated.
* @param tensorName The name by which the tensor shall be identified
* @brief Allocates a tensor on host and stores the reference inside multiple attributes.
* @param descriptor Tensor's metadata
* @param isState If true, the tensor shall also be stored inside the state variables map. In this case, adding the
* tensor to this structure would be required in order to correctly answer the state queries.
* @param index The index which the allocated tensor shall use.
* @param isInput Determines the containers in which the newly allocated tensors will be stored.
* @param allocator If provided, the tensor uses the custom allocator instead of using the default one.
 * @param batchSize If provided, the value of the shape on the 0th axis is overridden with this value.
* @return Pointer towards the allocated tensor
*/
void allocate_tensor(std::string tensorName,
const IONodeDescriptor& descriptor,
TensorType tensorType = TensorType::InputOrOutput,
const ov::Allocator& allocator = {}) const;

// Mutable to return reference to ov::Tensor
mutable std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _allTensors;
mutable std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _shapesTensors;
// A copy of each tensor is needed to maintain the original L0 memory allocation in case the user provides another
// memory area for the tensor.
mutable std::unordered_map<std::string, std::shared_ptr<ov::ITensor>> _copyAllTensors;

mutable std::unordered_map<std::string, std::shared_ptr<VariableState>> _variableStates;
std::shared_ptr<ov::ITensor> allocate_tensor(const IODescriptor& descriptor,
const size_t index,
const bool isInput,
const ov::Allocator& allocator = {},
const std::optional<std::size_t> batchSize = std::nullopt) const;

// This is an intel_npu::ICompiledModel pointer, but we need to use the OV base class because
// ov::IInferRequest::get_compiled_model returns a reference to a shared_ptr!
std::shared_ptr<const ov::ICompiledModel> _compiledModel;

NetworkMetadata _metadata;

// Stored in order to avoid additional processing when launching inferences
std::vector<std::string> _inputAndStateInputNames;
std::vector<std::string> _outputAndStateOutputNames;
mutable std::vector<std::shared_ptr<ov::ITensor>> _userInputTensors;
mutable std::vector<std::shared_ptr<ov::ITensor>> _userOutputTensors;

std::unordered_map<std::string, std::string> _nodeNameToLegacyName;
std::unordered_map<std::string, std::string> _legacyNameToNodeName;
mutable std::vector<ov::SoPtr<ov::IVariableState>> _variableStates;

/**
* @see ov::ISyncInferRequest
*/
mutable std::unordered_map<size_t, FoundPort> _cachedPorts;

/**
* @see ov::ISyncInferRequest
*/
mutable std::mutex _cacheMutex;
};

} // namespace intel_npu
69 changes: 69 additions & 0 deletions src/plugins/intel_npu/src/al/src/icompiler.cpp
@@ -0,0 +1,69 @@
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "intel_npu/al/icompiler.hpp"

namespace intel_npu {

std::optional<size_t> NetworkMetadata::findByName(const std::vector<IODescriptor>& descriptors,
const std::string_view targetName) {
for (size_t descriptorIndex = 0; descriptorIndex < descriptors.size(); ++descriptorIndex) {
if (descriptors.at(descriptorIndex).nameFromCompiler == targetName) {
return descriptorIndex;
}
}

return std::nullopt;
}

void NetworkMetadata::bindRelatedDescriptors() {
size_t ioIndex = 0;

for (IODescriptor& input : inputs) {
if (input.relatedDescriptorIndex.has_value()) {
++ioIndex;
continue;
}

if (input.isStateInput) {
const std::optional<size_t> relatedDescriptorIndex = findByName(outputs, input.nameFromCompiler);

if (relatedDescriptorIndex.has_value()) {
input.relatedDescriptorIndex = relatedDescriptorIndex;
outputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
}
} else if (input.isShapeTensor) {
const std::optional<size_t> relatedDescriptorIndex = findByName(inputs, input.nameFromCompiler);

if (relatedDescriptorIndex.has_value() && *relatedDescriptorIndex != ioIndex) {
input.relatedDescriptorIndex = relatedDescriptorIndex;
inputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
}
}

++ioIndex;
}

ioIndex = 0;

for (IODescriptor& output : outputs) {
if (output.relatedDescriptorIndex.has_value()) {
++ioIndex;
continue;
}

if (output.isShapeTensor) {
const std::optional<size_t> relatedDescriptorIndex = findByName(outputs, output.nameFromCompiler);

if (relatedDescriptorIndex.has_value() && *relatedDescriptorIndex != ioIndex) {
output.relatedDescriptorIndex = relatedDescriptorIndex;
outputs.at(*relatedDescriptorIndex).relatedDescriptorIndex = std::optional(ioIndex);
}
}

++ioIndex;
}
}

} // namespace intel_npu