
Generalize model cache reusing #21492

Merged

Conversation

riverlijunjie (Contributor):

Details:

  • Generalize model cache reusing

Tickets:

  • 126364

github-actions bot added labels category: inference, category: GPU, category: CPU on Dec 6, 2023
riverlijunjie force-pushed the river/model_cache_format branch 3 times, most recently from 924fc2a to 824da2b on December 6, 2023
ilya-lavrenov self-assigned this on Dec 6, 2023
riverlijunjie force-pushed the river/model_cache_format branch from 824da2b to 599496a on December 7, 2023
src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved):
    auto& ov_version = ov::get_openvino_version();
    m_compiled_model_format["OV_VERSION"] = ov_version.buildNumber;
    for (const auto& device : m_device_map) {
        m_compiled_model_format[device.first] = device.second->get_info().driver_version;
Contributor:

I suppose we need to take the driver version specific to the device we call compile_model on.
But I hope that the driver is the same for all devices, so we can collect only the first device's driver here (otherwise, depending on the actual number of devices on the system, we could end up with a different m_compiled_model_format).

@isanghao could you please confirm?

Contributor Author (riverlijunjie):

We printed the GPU info on a machine with one iGPU and one dGPU:

    for (const auto& device : m_device_map) {
        auto driver_info = device.second->get_info();
        std::cout << device.first << ": (" << driver_info.dev_name << ", " << driver_info.vendor_id << ", "
                  << driver_info.device_id << ", " << driver_info.driver_version << ")" << std::endl;
    }
0: (Intel(R) UHD Graphics 770, 32902, 42880, 23.17.26241.33)
1: (Intel(R) Arc(TM) A770 Graphics, 32902, 22176, 23.17.26241.33)
2: (Intel(R) Arc(TM) A770 Graphics, 32902, 22176, 23.17.26241.33)

It seems the iGPU and dGPU have the same driver version; I'm not sure whether they can have different driver versions on the same machine.
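If the driver version really is uniform across devices, the per-device loop could collapse to a single lookup. A minimal sketch of that idea (the map type and helper name are illustrative, not the actual plugin code):

```cpp
#include <map>
#include <string>

// Illustrative sketch only: take the driver version from the first enumerated
// device instead of recording one entry per device, so the recorded value is
// independent of how many GPUs happen to be present. Keys/values hypothetical.
std::string first_device_driver_version(
        const std::map<std::string, std::string>& device_driver_versions) {
    if (device_driver_versions.empty())
        return {};
    return device_driver_versions.begin()->second;
}
```

This makes the recorded value stable when a second discrete GPU is added or removed between runs.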

Contributor:

I suppose the driver version is defined by driver_info.driver_version, which is the same for all devices, while the other values are per-device properties. I hope that device_id and vendor_id are driver-independent.

CC @isanghao

Contributor:

@riverlijunjie I suppose we need to store the driver version for a single device; otherwise, depending on the number of devices on the system, we will get a different value for m_compiled_model_format.

Contributor Author (riverlijunjie):

Got it, will do it!

src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved)
src/plugins/intel_gpu/src/plugin/plugin.cpp (outdated thread, resolved)
riverlijunjie force-pushed the river/model_cache_format branch 2 times, most recently from d073124 to cae554a on December 8, 2023
riverlijunjie marked this pull request as ready for review on December 8, 2023
riverlijunjie requested review from a team as code owners on December 8, 2023
riverlijunjie changed the title from [WIP]Generalize model cache reusing to Generalize model cache reusing on Dec 8, 2023
src/plugins/intel_gpu/include/intel_gpu/plugin/plugin.hpp (outdated thread, resolved):
        "[GPU] compiled_model_format: Couldn't find device for GPU with id ",
        m_compiled_model_format_device_id);
    // Set specified device info for compiled_model_format
    model_format_map["DEVICE_ID"] = m_compiled_model_format_device_id;
Contributor:

Why do we need to keep device_id?
caching_properties already contains the device architecture, which identifies the device we compile for.

Contributor Author (riverlijunjie), Dec 13, 2023:

Is there any scenario where device_id changes for the same device when there are multiple GPUs?

Contributor Author (riverlijunjie):

Could we use driver_info.vendor_id + driver_info.device_id to guarantee the actual device has not changed?
That is, if driver_info.vendor_id + driver_info.device_id is the same, the device should have the same hardware configuration (cache size, EU count)?

Contributor:

Is there any scenario where device_id changes for the same device when there are multiple GPUs?

Yes: if we have multiple discrete GPUs and one of them is removed from the machine between consecutive runs.

Could we use driver_info.vendor_id + driver_info.device_id to guarantee the actual device has not changed?

We don't need to guarantee that in the compiled_model_format property. That is the responsibility of caching_properties, which holds ov::device::architecture and does not depend on the actual order of the devices.

  • caching_properties defines a list of properties that describe the device and hardcoded compilation options; these affect the final blob and its hash.
  • compiled_model_format holds the properties that can later change at runtime (they are not hardcoded).

For example, we compile a blob and later load it with:

  • A different OV runtime version. For plugins like CPU and GPU it matters that the OV runtime is the same, because there is no guarantee that the same blob can be used across multiple OV versions.
  • A different driver version (read: execution runtime) for GPU and NPU devices. Drivers can be incompatible with blobs compiled by other drivers (read: compilers).

Maybe we need a better name for compiled_model_format to denote a list of properties that can change in future runs and lead to compiled-blob removal (because the system was updated and a new blob should be generated).
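The split described in this comment can be sketched roughly as follows. All names and the hashing scheme below are illustrative stand-ins, not OpenVINO's actual implementation: the point is only that one property set feeds the blob hash, while the other is stored beside the blob and compared on load.

```cpp
#include <functional>
#include <map>
#include <string>

// Illustrative sketch: caching_properties feed the blob hash (they identify
// which cached blob to look up), while runtime properties are stored next to
// the blob and compared on load; a mismatch means the cached blob must be
// dropped and regenerated under the same hash.
using PropertyMap = std::map<std::string, std::string>;

std::string blob_hash(const PropertyMap& caching_props) {
    std::string joined;
    for (const auto& kv : caching_props)
        joined += kv.first + "=" + kv.second + ";";
    return std::to_string(std::hash<std::string>{}(joined));
}

bool blob_reusable(const PropertyMap& stored_runtime_props,
                   const PropertyMap& current_runtime_props) {
    return stored_runtime_props == current_runtime_props;
}
```

With this split, a driver update changes only the runtime-property comparison, so the stale blob is found under the same hash and replaced instead of accumulating on disk.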

Contributor Author (riverlijunjie):

compiled_model_format holds runtime software dependencies for running the model on the device, such as the driver version, while caching_properties should hold hardware information, such as architecture and EU count, right?

But we can see that in the GPU plugin, caching_properties already contains ov::intel_gpu::driver_version, so it seems driver_version is unnecessary in compiled_model_format?

    std::vector<ov::PropertyName> Plugin::get_caching_properties() const {
        static const std::vector<ov::PropertyName> caching_properties = {
            ov::PropertyName{ov::device::architecture.name(), PropertyMutability::RO},
            ov::PropertyName{ov::intel_gpu::execution_units_count.name(), PropertyMutability::RO},
            ov::PropertyName{ov::intel_gpu::driver_version.name(), PropertyMutability::RO},
            ov::PropertyName{ov::hint::inference_precision.name(), PropertyMutability::RW},
            ov::PropertyName{ov::hint::execution_mode.name(), PropertyMutability::RW},
        };
        return caching_properties;
    }

By the way, if we need a better name for compiled_model_format, I vote for compiled_model_runtime_properties.

Contributor:

But we can see that in the GPU plugin, caching_properties already contains ov::intel_gpu::driver_version

I suppose we need to move it from caching_properties to compiled_model_format. Otherwise, if the driver version changes, the blob hash changes as well => the old blob stays on the file system forever and is never removed by OpenVINO.
If we move driver_version to compiled_model_format, the hash stays the same, but OV detects that compiled_model_format changed => the compiled blob is dropped and regenerated.

compiled_model_format, I vote compiled_model_runtime_properties

I agree

    // Set specified device info for compiled_model_format
    model_format_map["DEVICE_ID"] = m_compiled_model_format_device_id;
    model_format_map["DRIVER_VERSION"] =
        m_device_map.at(m_compiled_model_format_device_id)->get_info().driver_version;
Contributor:

But as we saw, the driver version does not depend on the actual device: when you install a driver on the system, it works for all devices.

Contributor Author (riverlijunjie):

You mean we need to bind to the actual device, not the driver version?

Contributor Author (riverlijunjie):

The actual device can be represented by driver_info.vendor_id + driver_info.device_id?
So we would apply these 3 items:

  1. vendor_id
  2. device_id
  3. driver_version

Does it make sense? @ilya-lavrenov

Contributor:

Please have a look at https://github.com/openvinotoolkit/openvino/pull/21492/files#r1426253560
IMO we need to cover only driver version here, because device architecture is already a part of caching_properties.

Contributor (ArtemySkrebkov) left a comment:

Looks good from functionality perspective. 👍

Looking forward to checking it out with NPU.

src/inference/src/dev/core_impl.cpp (outdated thread, resolved)
riverlijunjie force-pushed the river/model_cache_format branch from 78f639f to b5f0a31 on December 14, 2023
     *
     * @ingroup ov_dev_api_plugin_api
     */
    static constexpr Property<std::string, PropertyMutability::RO> compiled_model_runtime_properties{
Contributor:

It could contain a std::vector<std::string>, and the actual conversion to string could be done at the Core level.

Contributor Author (riverlijunjie):

What benefits would std::vector<std::string> have over std::string?
From my understanding, the Core level doesn't need to understand its contents; it only reads it from the blob file's header or sends it to the plugin side, so std::string is enough, right?

Contributor (ilya-lavrenov), Dec 19, 2023:

IMO it would simplify plugin code, because currently plugins have to parse / convert from / to string.

Contributor Author (riverlijunjie):

On the plugin side, we use ov::AnyMap and convert it to or from std::string; that doesn't seem too complex?
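The round trip discussed here can be sketched as follows, using a plain std::map as a stand-in for ov::AnyMap (the serialization format and function names are illustrative, not the plugin's actual code):

```cpp
#include <map>
#include <sstream>
#include <string>

// Illustrative sketch of the plugin-side round trip: the plugin serializes its
// runtime properties into a single string, which the Core treats as an opaque
// token stored in the blob header and handed back to the plugin on load.
// Assumes keys and values contain neither '=' nor ';'.
using Props = std::map<std::string, std::string>;

std::string props_to_string(const Props& props) {
    std::ostringstream os;
    for (const auto& kv : props)
        os << kv.first << '=' << kv.second << ';';
    return os.str();
}

Props props_from_string(const std::string& s) {
    Props props;
    std::istringstream is(s);
    std::string item;
    while (std::getline(is, item, ';')) {
        const auto pos = item.find('=');
        if (pos != std::string::npos)
            props[item.substr(0, pos)] = item.substr(pos + 1);
    }
    return props;
}
```

Because the Core only stores and forwards the string, the choice between std::string and std::vector<std::string> mainly affects how much of this parsing lives in each plugin.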

ilya-lavrenov enabled auto-merge (squash) on December 19, 2023
p-durandin (Contributor):

Please fix the ie_tests_win_cpu_vs2019_release test.

ilya-lavrenov merged commit 7b1074b into openvinotoolkit:master on Dec 19, 2023
100 checks passed
Labels: category: CPU, category: GPU, category: inference, Code Freeze
7 participants