
Generalize model cache reusing #21492

Merged

Conversation

riverlijunjie (Contributor):

Details:

  • Generalize model cache reusing

Tickets:

  • 126364

github-actions bot added labels category: inference, category: GPU, category: CPU on Dec 6, 2023
riverlijunjie force-pushed the river/model_cache_format branch 3 times, most recently from 924fc2a to 824da2b on December 6, 2023
ilya-lavrenov self-assigned this on Dec 6, 2023
riverlijunjie force-pushed the river/model_cache_format branch from 824da2b to 599496a on December 7, 2023
src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved):
    auto& ov_version = ov::get_openvino_version();
    m_compiled_model_format["OV_VERSION"] = ov_version.buildNumber;
    for (const auto& device : m_device_map) {
        m_compiled_model_format[device.first] = device.second->get_info().driver_version;
Contributor:

I suppose we need to take the driver version specific to the device we call compile_model on.
But I hope that the driver is the same for all devices, so we can collect only the first device's driver here (otherwise, depending on the actual number of devices on the system, we could end up with a different m_compiled_model_format).

@isanghao could you please confirm?

Contributor Author (riverlijunjie):

We printed the GPU info on a machine with one iGPU and one dGPU:

    for (const auto& device : m_device_map) {
        auto driver_info = device.second->get_info();
        std::cout << device.first << ": (" << driver_info.dev_name << ", " << driver_info.vendor_id << ", "
                  << driver_info.device_id << ", " << driver_info.driver_version << ")" << std::endl;
    }
0: (Intel(R) UHD Graphics 770, 32902, 42880, 23.17.26241.33)
1: (Intel(R) Arc(TM) A770 Graphics, 32902, 22176, 23.17.26241.33)
2: (Intel(R) Arc(TM) A770 Graphics, 32902, 22176, 23.17.26241.33)

It seems the iGPU and dGPU have the same driver version; I'm not sure whether they can have different driver versions on the same machine.
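If the driver version really is uniform across devices, the per-device loop could collapse to a single lookup. A minimal sketch of that idea (the map type and helper name are illustrative, not the actual plugin code):

```cpp
#include <map>
#include <string>

// Illustrative sketch only: take the driver version from the first enumerated
// device instead of recording one entry per device, so the recorded value is
// independent of how many GPUs happen to be present. Keys/values hypothetical.
std::string first_device_driver_version(
        const std::map<std::string, std::string>& device_driver_versions) {
    if (device_driver_versions.empty())
        return {};
    return device_driver_versions.begin()->second;
}
```

This makes the recorded value stable when a second discrete GPU is added or removed between runs.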

Contributor:

I suppose the driver version is defined by driver_info.driver_version, which is the same for all devices, while the other values are per-device properties. I hope that device_id and vendor_id are driver-independent.

CC @isanghao

Contributor:

@riverlijunjie I suppose we need to store the driver version for a single device; otherwise, depending on the number of devices on the system, we will get a different value for m_compiled_model_format.

Contributor Author (riverlijunjie):

Got it, will do it!

src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved)
src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved)
riverlijunjie force-pushed the river/model_cache_format branch 2 times, most recently from d073124 to cae554a on December 8, 2023
riverlijunjie marked this pull request as ready for review on December 8, 2023
riverlijunjie requested review from a team as code owners on December 8, 2023
riverlijunjie changed the title from [WIP]Generalize model cache reusing to Generalize model cache reusing on Dec 8, 2023
src/plugins/intel_gpu/include/intel_gpu/plugin/plugin.hpp (outdated thread, resolved):
        "[GPU] compiled_model_format: Couldn't find device for GPU with id ",
        m_compiled_model_format_device_id);
    // Set specified device info for compiled_model_format
    model_format_map["DEVICE_ID"] = m_compiled_model_format_device_id;
Contributor:

Why do we need to keep device_id?
caching_properties already contains the device architecture, which identifies the device we compile for.

Contributor Author (riverlijunjie), Dec 13, 2023:

Is there any scenario where device_id changes for the same device when there are multiple GPUs?

Contributor Author (riverlijunjie):

Could we use driver_info.vendor_id + driver_info.device_id to guarantee the actual device has not changed?
That is, if driver_info.vendor_id + driver_info.device_id is the same, the device should have the same hardware configuration (cache size, EU count)?

Contributor:

Is there any scenario where device_id changes for the same device when there are multiple GPUs?

Yes: if we have multiple discrete GPUs and one of them is removed from the machine between consecutive runs.

Could we use driver_info.vendor_id + driver_info.device_id to guarantee the actual device has not changed?

We don't need to guarantee that in the compiled_model_format property. That is the responsibility of caching_properties, which holds ov::device::architecture and does not depend on the actual order of the devices.

  • caching_properties defines a list of properties that describe the device and hardcoded compilation options; these affect the final blob and its hash.
  • compiled_model_format holds the properties that can later change at runtime (they are not hardcoded).

For example, we compile a blob and later load it with:

  • A different OV runtime version. For plugins like CPU and GPU it matters that the OV runtime is the same, because there is no guarantee that the same blob can be used across multiple OV versions.
  • A different driver version (read: execution runtime) for GPU and NPU devices. Drivers can be incompatible with blobs compiled by other drivers (read: compilers).

Maybe we need a better name for compiled_model_format to denote a list of properties that can change in future runs and lead to compiled-blob removal (because the system was updated and a new blob should be generated).
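The split described in this comment can be sketched roughly as follows. All names and the hashing scheme below are illustrative stand-ins, not OpenVINO's actual implementation: the point is only that one property set feeds the blob hash, while the other is stored beside the blob and compared on load.

```cpp
#include <functional>
#include <map>
#include <string>

// Illustrative sketch: caching_properties feed the blob hash (they identify
// which cached blob to look up), while runtime properties are stored next to
// the blob and compared on load; a mismatch means the cached blob must be
// dropped and regenerated under the same hash.
using PropertyMap = std::map<std::string, std::string>;

std::string blob_hash(const PropertyMap& caching_props) {
    std::string joined;
    for (const auto& kv : caching_props)
        joined += kv.first + "=" + kv.second + ";";
    return std::to_string(std::hash<std::string>{}(joined));
}

bool blob_reusable(const PropertyMap& stored_runtime_props,
                   const PropertyMap& current_runtime_props) {
    return stored_runtime_props == current_runtime_props;
}
```

With this split, a driver update changes only the runtime-property comparison, so the stale blob is found under the same hash and replaced instead of accumulating on disk.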

Contributor Author (riverlijunjie):

compiled_model_format holds runtime software dependencies for running the model on the device, such as the driver version, while caching_properties should hold hardware information, such as architecture and EU count, right?

But we can see that in the GPU plugin, caching_properties already contains ov::intel_gpu::driver_version, so it seems driver_version is unnecessary in compiled_model_format?

    std::vector<ov::PropertyName> Plugin::get_caching_properties() const {
        static const std::vector<ov::PropertyName> caching_properties = {
            ov::PropertyName{ov::device::architecture.name(), PropertyMutability::RO},
            ov::PropertyName{ov::intel_gpu::execution_units_count.name(), PropertyMutability::RO},
            ov::PropertyName{ov::intel_gpu::driver_version.name(), PropertyMutability::RO},
            ov::PropertyName{ov::hint::inference_precision.name(), PropertyMutability::RW},
            ov::PropertyName{ov::hint::execution_mode.name(), PropertyMutability::RW},
        };
        return caching_properties;
    }

By the way, if we need a better name for compiled_model_format, I vote for compiled_model_runtime_properties.

Contributor:

But we can see that in the GPU plugin, caching_properties already contains ov::intel_gpu::driver_version

I suppose we need to move it from caching_properties to compiled_model_format. Otherwise, if the driver version changes, the blob hash changes as well => the old blob stays on the file system forever and is never removed by OpenVINO.
If we move driver_version to compiled_model_format, the hash stays the same, but OV detects that compiled_model_format changed => the compiled blob is dropped and regenerated.

compiled_model_format, I vote compiled_model_runtime_properties

I agree

    // Set specified device info for compiled_model_format
    model_format_map["DEVICE_ID"] = m_compiled_model_format_device_id;
    model_format_map["DRIVER_VERSION"] =
        m_device_map.at(m_compiled_model_format_device_id)->get_info().driver_version;
Contributor:

But as we saw, the driver version does not depend on the actual device: when you install a driver on the system, it works for all devices.

Contributor Author (riverlijunjie):

You mean we need to bind to the actual device, not the driver version?

Contributor Author (riverlijunjie):

The actual device can be represented by driver_info.vendor_id + driver_info.device_id?
So we would apply these 3 items:

  1. vendor_id
  2. device_id
  3. driver_version

Does it make sense? @ilya-lavrenov

Contributor:

Please have a look at https://github.com/openvinotoolkit/openvino/pull/21492/files#r1426253560
IMO we need to cover only driver version here, because device architecture is already a part of caching_properties.

Contributor (ArtemySkrebkov) left a comment:

Looks good from functionality perspective. 👍

Looking forward to checking it out with NPU.

src/inference/src/dev/core_impl.cpp (outdated thread, resolved)
riverlijunjie force-pushed the river/model_cache_format branch from 78f639f to b5f0a31 on December 14, 2023
     *
     * @ingroup ov_dev_api_plugin_api
     */
    static constexpr Property<std::string, PropertyMutability::RO> compiled_model_runtime_properties{
Contributor:

It could contain a std::vector<std::string>, and the actual conversion to string could be done at the Core level.

Contributor Author (riverlijunjie):

What benefits would std::vector<std::string> have over std::string?
From my understanding, the Core level doesn't need to understand its contents; it only reads it from the blob file's header or sends it to the plugin side, so std::string is enough, right?

Contributor (ilya-lavrenov), Dec 19, 2023:

IMO it would simplify plugin code, because currently plugins have to parse / convert from / to string.

Contributor Author (riverlijunjie):

On the plugin side, we use ov::AnyMap and convert it to or from std::string; that doesn't seem too complex?
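The round trip discussed here can be sketched as follows, using a plain std::map as a stand-in for ov::AnyMap (the serialization format and function names are illustrative, not the plugin's actual code):

```cpp
#include <map>
#include <sstream>
#include <string>

// Illustrative sketch of the plugin-side round trip: the plugin serializes its
// runtime properties into a single string, which the Core treats as an opaque
// token stored in the blob header and handed back to the plugin on load.
// Assumes keys and values contain neither '=' nor ';'.
using Props = std::map<std::string, std::string>;

std::string props_to_string(const Props& props) {
    std::ostringstream os;
    for (const auto& kv : props)
        os << kv.first << '=' << kv.second << ';';
    return os.str();
}

Props props_from_string(const std::string& s) {
    Props props;
    std::istringstream is(s);
    std::string item;
    while (std::getline(is, item, ';')) {
        const auto pos = item.find('=');
        if (pos != std::string::npos)
            props[item.substr(0, pos)] = item.substr(pos + 1);
    }
    return props;
}
```

Because the Core only stores and forwards the string, the choice between std::string and std::vector<std::string> mainly affects how much of this parsing lives in each plugin.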

ilya-lavrenov enabled auto-merge (squash) on December 19, 2023
p-durandin (Contributor):

Please fix the ie_tests_win_cpu_vs2019_release test.

ilya-lavrenov merged commit 7b1074b into openvinotoolkit:master on Dec 19, 2023
100 checks passed
Labels: category: CPU, category: GPU, category: inference, Code Freeze
7 participants