Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNNL] Add TensorRequisite concept. Multi instance support #11345

Merged
merged 1 commit into from
Jun 2, 2022

Conversation

apeskov
Copy link
Contributor

@apeskov apeskov commented May 17, 2022

Summary
There are several limitation which prevent DNNL runtime from using in multi instance mode. This patch tries to eliminate some of them.

Multi instance mode means: call Run() method concurrently from several threads for single instance of DNNLJSONRuntime.

Particularly changed for multi instance support:

  • Do not modify DNNLJSONRuntime fields from Run method. Make it like a "const" method.
  • Use explicit dnnl scratchpads where it requested.
  • Make Intermediate tensors collection individual for each thread.

Other improvements:

  • Zero copy handling of Input/Output tensors
  • Use query API to ask DNNL about desired layouts. Prevent from using of unoptimised kernels.
  • Automatic injection or reorder primitives if layout doesn't match.
  • Support of different data types (int8, uint8, int32). Eliminate all "fp32" hardcoded values.

Details
Introduced indirect handling of memory objects. New objects are TensorRequisite and TensorRegistry.
TensorRequisite - describe sequence of transformation on top of some source tensor.
TensorRegistry - implement matching of TensorRequisite described tensor and real dnnl::memory objects.

This concept allows to:

  • Decouple primitives arguments and real memory object. Matching of arguments to memory objects happens on demand depending on contexts of execution (thread id, arguments of method Run).
  • Don't care about nature of tensor. Constant weights, intermediate tensors and external input/output processed identically.

Some pseudo code to demonstrate concept:

DLTensor src_mem = {5, 2, 128, 128, 8} // 5D tensor

// Describe sequence of layout transformation
auto tr = TensorRequisite.AsIs(src_mem, eid);  // 5D
tr = tr.treatAs("ABCD8b");  // 4D
tr = tr.permute({0, 2, 3, 1});  // permute axes NCHW -> NHWC
tr = tr.crop({1, 128, 128, 16}, {0, 0, 0});  // extract first batch
tr = tr.squeeze();

// register TR
TensorRegistry t_reg;
auto t_id = t_reg.register(tr);

// Obtain dnnl::memory object inside of method Run()
auto solver = t_reg.makeSolver(ext_io_provider);
auto mem = solver(t_id);

@apeskov
Copy link
Contributor Author

apeskov commented May 17, 2022

@masahi Please take a look. This is a part of changes extracted from #9618.

@masahi masahi self-assigned this May 17, 2022
@masahi
Copy link
Member

masahi commented May 17, 2022

Please fix the build.

@apeskov apeskov force-pushed the ap/dnnl-tensor-requisite branch 2 times, most recently from cca999f to 03726da Compare May 18, 2022 21:36
@masahi
Copy link
Member

masahi commented May 20, 2022

Can you make the coding style consistent with the rest of the codebase? We are not supposed to use camelCase. I can tolerate if there are only a few of them, but I feel this change contains too much of your own style.

@apeskov
Copy link
Contributor Author

apeskov commented May 25, 2022

Yes, I can. Could you please indicate particular places you treat as violation of coding style? I was trying to keep original code style as much as possible.

We are not supposed to use camelCase

This statement slightly contradicts with TVM recommendation (Google style) and other TVM code base. Do you mean I should use UpperCamelCase instead of "lowerCamelCase"?

@masahi
Copy link
Member

masahi commented May 25, 2022

Yes, sorry I meant we should use UpperCamelCase instead of lowerCamelCase. I saw many lowerCamelCase in this PR. Since this is a runtime code for one backend and not like the core code, I'd say we don't have to be too strict with this rule. But in general, I think it's better to stick with the convention, unless there is a good reason to deviate (e.g. match coding style with dnnl etc).

@masahi
Copy link
Member

masahi commented May 26, 2022

There is a conflict due to the merge of #11111

@apeskov apeskov force-pushed the ap/dnnl-tensor-requisite branch from 0372e45 to 84e221e Compare May 31, 2022 14:17
@apeskov
Copy link
Contributor Author

apeskov commented May 31, 2022

@masahi, I'm slightly confused with coding style.
In PR #11111 you approved a new function dtype_dl2dnnl. This is definitely not a CamelCase naming.

In my patch I would like to add several more similar util functions. And what naming I should use? snake_case like previous one or CamelCase like you requested?

@masahi
Copy link
Member

masahi commented Jun 1, 2022

Yeah, how about this: I don't think using CamelCase should be mandatory for all functions. As a general rule of thumb, we should use CamelCase by default, but if a person is aware of this general convention but still wants to use snake_case for some reasons, I'd say go ahead. In particular, since DNNL uses snake_case, I understand that making some util functions be snake_case can make the code more natural.

I brought this up just because I saw a lot of uses of the style like makeSolver in this PR, which seemed arbitrary.

Allow to use DNNL runtime in multi instance mode.
Thread safe execution of Run() method.

Signed-off-by: Alexander Peskov <[email protected]>
@apeskov apeskov force-pushed the ap/dnnl-tensor-requisite branch from 84e221e to ee39484 Compare June 1, 2022 18:50
@apeskov
Copy link
Contributor Author

apeskov commented Jun 2, 2022

@masahi Thank you for explanation about mandatory and recommendation part of code style.
PR turned green. You can continue reviewing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants