
Complexity metric / checks must be performed on original source code #1547

Closed
ivangalkin opened this issue Aug 27, 2018 · 7 comments

@ivangalkin
Contributor

  • the complexity metrics "Cognitive Complexity" and "Cyclomatic Complexity", as well as the metrics derived from cyclomatic complexity: "Complex Functions (NR)", "Complex Functions (%)", "Complex Functions Lines of Code", "Complex Functions Lines of Code (%)" and...
  • the complexity checks "FunctionCognitiveComplexityCheck", "FunctionComplexity", "ClassComplexity", "FileComplexity"...

...are calculated/performed on preprocessed code. The preprocessor falsifies the code complexity:

  • expansion of macros makes the code more complex
  • evaluation of preprocessor directives like #if, #ifdef, #else etc. makes the code less complex

Example of how macro expansion affects the FunctionComplexity check: the EXPECT_EQ macro adds 3 extra cyclomatic complexity points because it expands to a switch-case plus an if.

[screenshot: complexity_vs_macro]

Possible solutions:

  • keep information about which AST parts were added during preprocessing (marking some AST nodes as generated; helps only with macro expansions)
  • keep both the original and the preprocessed code for each compile unit (create two distinct ASTs)
  • run some visitors before the preprocessor and others after it (there might be more visitors spoiled by the preprocessor, e.g. the syntax-highlighting one)

Possible problems:

  • creating an AST for the non-preprocessed (original) source code might be tricky: while a macro expansion can be mapped to a function-invocation AST (?), not evaluating preprocessor directives can lead to unparseable code (?)
@guwirth
Collaborator

guwirth commented Aug 27, 2018

@ivangalkin thx for the investigation. Meanwhile I also tend to evaluate only "what you see later on in the source code view" (i.e. without preprocessing the code).

> perform some set of visitors before preprocessor and some set after it (there might be more visitors spoiled by preprocessor, e.g. the syntax-highlighting one)

@ALL: Where do you think we still need preprocessed code?

@jmecosta
Member

jmecosta commented Aug 27, 2018 via email

ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this issue Sep 5, 2018
* partial solution for SonarOpenCommunity#1547

MOTIVATION:

* SonarOpenCommunity#1494 made visible the fact that the preprocessor affects the complexity
metric. E.g. if a function uses macros, the underlying (hidden) complexity
of these macros increases the complexity of the function. This is wrong
for many reasons:

1. Cognitive complexity:
Macros do not increase the cognitive complexity of the source code. To
the reader they look like function invocations.
Example: the ASSERT_XX/EXPECT_XX macros of GTest.

2. Cyclomatic complexity:
Similar logic applies to cyclomatic complexity: macro expansions
are similar to function calls (which become inlined). In this case
the cyclomatic complexity of the caller should not be increased;
the [cyclomatic] complexity of the callee must be calculated separately.

3. The preprocessor doesn't distinguish between macros that were
declared inside the analyzed project and macros included from an
external header. Obviously, external source code (e.g. again the very
complex GTest macros) should not affect the complexity of someone's
custom SonarQube project.

SOLUTION:

In general it's impossible to build a valid C/C++ AST without
performing the preprocessing step. The original source code might
be syntactically incorrect (e.g. if macros are used to "invent" some
domain-specific language on top of C/C++). All metrics of sonar-cxx
are however AST-based. (This is a very questionable decision,
e.g. because #if/#ifdef etc. might affect the resulting complexity or [NC]LOC
dramatically.) So we cannot skip the preprocessing step entirely and/or
perform metric calculation/checks on the non-preprocessed code.

Sonar SSLR offers the possibility to mark some tokens as generated.
This property (`Token::isGeneratedCode()`) is for example used
to ignore generated code during
* [NC]LOC counting (see `AbstractLineLengthCheck<>`)
* CPD analysis (see `CxxCpdVisitor`) or
* highlighting (see `CxxHighlighterVisitor`)

This patch ensures that
* all `AstNode`s that were created during macro expansion are marked as
generated
* all complexity-related visitors ignore generated `AstNode`s

RESULT:

PRO:
1. The complexity of "callers" is no longer affected by macros.
   This makes the complexity check look logical again.

CON:
1. The complexity of "callees" is not calculated at all. This
is correct in the case of external macros, but wrong w.r.t.
project-internal macros. Possible solution: the preprocessor must try
to extract function-like macro declarations into some special AST subtree.
Afterwards, complexity metrics/checks will be able to calculate/check
macros as a separate language construct.

By the way: it should be analyzed if and how macros affect other metrics,
for example LOC etc.

2. This patch might change the complexity-related metrics. Depending
on how many macros are used in the analyzed project, the reduction
can be radical.
@ivangalkin
Contributor Author

@guwirth I came to the conclusion that the current visitors (all metric calculators and checks are AST visitors) cannot be applied to the original (non-preprocessed) code. The original code might not be parseable at all (there are far too many tricks one can play with macros and compiler directives).

I suggest marking macro expansion nodes as generated and letting all complexity-related visitors ignore them. Please review #1552

@guwirth
Collaborator

guwirth commented Sep 5, 2018

@ivangalkin if you test this on your own code base: how big is the difference in 'Measures/Complexity'?

@ivangalkin
Contributor Author

@guwirth we have very GTest-intensive code and the complexity decrease is about 30%.
Previously each test case with 3 assertions was considered too complex.

| Metric                              | Before  | After   |
| ----------------------------------- | ------- | ------- |
| Cyclomatic Complexity               | 154 538 | 101 610 |
| Cognitive Complexity                | 106 200 | 55 515  |
| Complex Functions                   | 1 484   | 614     |
| Complex Functions (%)               | 2.4%    | 1.0%    |
| Complex Functions Lines of Code     | 71 494  | 54 767  |
| Complex Functions Lines of Code (%) | 23.0%   | 17.7%   |

@ivangalkin
Contributor Author

Hi All,

update w.r.t. this issue:

MACRO

COMPILER DIRECTIVES

  • Previously I saw an issue in the metric calculation for code like this:

```cpp
#ifdef DEBUG
    <150 NCLOCs>
    <+50 complexity points>
#else
    <250 NCLOCs>
    <+150 complexity points>
#endif
```

Depending on the -D settings, the NCLOC/complexity value was 150/50 or 250/150 respectively. This is quite wrong in my opinion, because it again measures the NCLOC/complexity of the AST rather than the metric of the source code.

Meanwhile I tend to change my mind:

  1. metric calculation for such branches is impossible in general, since they might contain arbitrary text. Our analyzers are AST-based, and parsing a valid AST from non-preprocessed code is very tricky (if possible at all).
  2. users are able to pass the -Ds as analysis parameters, so one can create separate projects if metrics for some specific compile variants are required, e.g. one for the debug and one for the release code.

What do you think? Can we close the issue?

@guwirth
Collaborator

guwirth commented Oct 6, 2018

Close it with #1552
