[microNPU][2c] Add performance modelling to cascader #9778

jacobbohlin · 2021-12-20T21:11:30Z

NOTE: This PR builds on top of #9469 and #9471 and therefore includes those changes. This PR will remain as 'draft' until both dependencies are merged.

The algorithm described in the RFC uses two metrics for Pareto culling: performance and memory usage. This commit addresses the former and introduces the basis of performance estimation for Parts. It also includes performance estimation code specific to ethosu_conv2d.

The output of the performance model is only meant to be consumed by the cascader.
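For readers unfamiliar with the term, Pareto culling over the two metrics named above can be sketched roughly as follows. This is only an illustration, not the cascader's implementation: the `Candidate` struct, its fields, and the `Dominates`/`ParetoCull` helpers are hypothetical names, and the quadratic loop is chosen for clarity rather than speed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical candidate type; the cascader's real plan type is not shown in this PR text.
struct Candidate {
  int cycles;        // estimated performance cost (lower is better)
  int memory_bytes;  // estimated memory usage (lower is better)
};

// True if `a` is no worse than `b` on both metrics and strictly better on at least one.
bool Dominates(const Candidate& a, const Candidate& b) {
  return a.cycles <= b.cycles && a.memory_bytes <= b.memory_bytes &&
         (a.cycles < b.cycles || a.memory_bytes < b.memory_bytes);
}

// Keeps only the candidates that no other candidate dominates (O(n^2) for clarity).
std::vector<Candidate> ParetoCull(const std::vector<Candidate>& candidates) {
  std::vector<Candidate> frontier;
  for (const Candidate& c : candidates) {
    assert(c.cycles >= 0 && c.memory_bytes >= 0);  // metrics are assumed non-negative
    const bool dominated =
        std::any_of(candidates.begin(), candidates.end(),
                    [&c](const Candidate& other) { return Dominates(other, c); });
    if (!dominated) frontier.push_back(c);
  }
  return frontier;
}
```

With this sketch, a candidate that is both slower and larger than another is dropped, while candidates that trade cycles for memory both survive.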

jacobbohlin · 2021-12-20T21:12:55Z

@mbaret @manupa-arm @NicolaLancellotti

mbaret · 2022-01-12T14:05:08Z

src/contrib/ethosu/cascader/parts/ethosu.cc

-    if (!is_rolling) {
-      num_blocks *= output_stripe_config->GetShape()[i] * output_stripe_config->GetStripes()[i] /
+    if (buffer_mode == BufferMode::RECOMPUTE) {
+      num_blocks *= static_cast<float>(output_stripe_config->GetShape()[i] *
+                                       output_stripe_config->GetStripes()[i]) /
                    block_shape[i];
    } else {
-      num_blocks *= output_stripe_config->GetExtent()[i] / block_shape[i];
+      num_blocks *= static_cast<float>(output_stripe_config->GetExtent()[i]) / block_shape[i];
    }
  }


Just to mention that this logic is a placeholder and will be replaced in a later patch.
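As a side note on the `static_cast<float>` in the diff above: if the shape, stripe, and block values are integers (an assumption; their types are not visible in this excerpt), dividing them directly truncates before the result is multiplied into `num_blocks`. A minimal sketch of the difference, using hypothetical helper functions rather than the real `StripeConfig` API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-axis loop: integer division truncates each ratio.
float NumBlocksTruncating(const std::vector<int>& extent, const std::vector<int>& block_shape) {
  assert(extent.size() == block_shape.size());
  float num_blocks = 1.0f;
  for (std::size_t i = 0; i < extent.size(); ++i) {
    num_blocks *= extent[i] / block_shape[i];  // int / int, fractional part is lost
  }
  return num_blocks;
}

// Same loop with the cast applied first, so each ratio keeps its fractional part.
float NumBlocksFractional(const std::vector<int>& extent, const std::vector<int>& block_shape) {
  assert(extent.size() == block_shape.size());
  float num_blocks = 1.0f;
  for (std::size_t i = 0; i < extent.size(); ++i) {
    num_blocks *= static_cast<float>(extent[i]) / block_shape[i];  // float / int
  }
  return num_blocks;
}
```

For an extent of {10, 7} and a block shape of {4, 2}, the truncating version multiplies 2 by 3, while the cast version multiplies 2.5 by 3.5.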

mbaret · 2022-01-12T15:16:38Z

Just a quick note on the test coverage of this feature. The results of the performance model are not explicitly tested against the FVP because we don’t have performance instrumentation available in CI. We will however be testing this component downstream where such instrumentation is available.

* Added the pre-computed performance modelling per block.
* Added the aggregation of cycles given a stripe config.
* Implemented the op-specific performance code for conv2d.
* Created a DeviceConfig class to hold constant performance related data that is dependent on the accelerator configuration.
* Added generation of all valid block configs. This is pre-computed and given as an argument when constructing EthosuParts.
* Implemented selection of the block config that gives the least amount of data read given a StripeConfig.
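The last item, picking the block config with the least data read for a given stripe, could look something like the sketch below. Everything here is an assumption made for illustration: the `BlockConfig` struct, the two-dimensional stripe, and the cost function (whole blocks needed to cover the stripe, times elements per block) are hypothetical and are not taken from the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2D block config; the real class carries more information.
struct BlockConfig {
  int height;
  int width;
};

std::int64_t CeilDiv(std::int64_t a, std::int64_t b) { return (a + b - 1) / b; }

// Assumed cost: elements touched when the stripe is covered by whole blocks.
std::int64_t EstimatedDataRead(const BlockConfig& block, int stripe_h, int stripe_w) {
  const std::int64_t blocks = CeilDiv(stripe_h, block.height) * CeilDiv(stripe_w, block.width);
  return blocks * block.height * block.width;
}

// Returns the index of the pre-computed config with the smallest estimated read.
// Ties are resolved in favour of the earliest config.
std::size_t SelectBlockConfig(const std::vector<BlockConfig>& valid_configs, int stripe_h,
                              int stripe_w) {
  assert(!valid_configs.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < valid_configs.size(); ++i) {
    if (EstimatedDataRead(valid_configs[i], stripe_h, stripe_w) <
        EstimatedDataRead(valid_configs[best], stripe_h, stripe_w)) {
      best = i;
    }
  }
  return best;
}
```

Passing the valid configs in as an argument mirrors the description above, where they are pre-computed once and handed to the Part at construction time.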

mbaret · 2022-01-17T16:17:02Z

cc @manupa-arm could you take a look and merge if everything's OK? Thanks

manupak

LGTM

manupak · 2022-01-17T16:24:49Z

Thanks! @jacobbohlin @mbaret

[microNPU][2c] Initial Performance Model

* Added the pre-computed performance modelling per block.
* Added the aggregation of cycles given a stripe config.
* Implemented the op-specific performance code for conv2d.
* Created a DeviceConfig class to hold constant performance related data that is dependent on the accelerator configuration.
* Added generation of all valid block configs. This is pre-computed and given as an argument when constructing EthosuParts.
* Implemented selection of the block config that gives the least amount of data read given a StripeConfig.
* Add test guards
* Extended block config testing

jacobbohlin mentioned this pull request Dec 21, 2021

[microNPU][2d] Add more Part matchers to cascader #9785

Merged

mbaret force-pushed the ethosu-cascader-4 branch 2 times, most recently from 77a5b22 to 366aa80 Compare December 22, 2021 10:54

mbaret force-pushed the ethosu-cascader-4 branch 2 times, most recently from 1124baf to 35d9164 Compare January 4, 2022 15:14

mbaret mentioned this pull request Jan 5, 2022

[Tracking Issue] Arm(R) Ethos(TM)-U Cascading Scheduler #9429

Closed

12 tasks

jacobbohlin mentioned this pull request Jan 5, 2022

[microNPU][2b] Create CascaderGraphs from TE graphs #9471

Merged

mbaret force-pushed the ethosu-cascader-4 branch from 35d9164 to 0fa664d Compare January 10, 2022 16:36

jacobbohlin force-pushed the ethosu-cascader-4 branch 2 times, most recently from 6366031 to 426d0ae Compare January 11, 2022 14:36

jacobbohlin marked this pull request as ready for review January 11, 2022 14:42

jacobbohlin requested review from anijain2305, areusch, comaniac, icemelon, jroesch, junrushao, jwfromm, MarisaKirisame, mbrookhart, merrymercy, slyubomirsky, tqchen, vinx13, wweic, yzhliu, zhiics and ZihengJiang as code owners January 11, 2022 14:42

jacobbohlin force-pushed the ethosu-cascader-4 branch from 426d0ae to c4a4b5a Compare January 12, 2022 11:27

mbaret reviewed Jan 12, 2022

jacobbohlin added 3 commits January 17, 2022 11:20

Add test guards

13f2eb6

Extended block config testing

e1daf76

jacobbohlin force-pushed the ethosu-cascader-4 branch from be05493 to e1daf76 Compare January 17, 2022 10:20

mbaret approved these changes Jan 17, 2022

manupak approved these changes Jan 17, 2022

manupak merged commit 133bb9c into apache:main Jan 17, 2022

driazati mentioned this pull request Jul 14, 2022

TVM v0.9.0.rc0 Release Candidate Notes #12102

Closed
