Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpd algorithm: what is a duplicate? - please vote #946

Closed
guwirth opened this issue Sep 11, 2016 · 4 comments · Fixed by #953
Closed

cpd algorithm: what is a duplicate? - please vote #946

guwirth opened this issue Sep 11, 2016 · 4 comments · Fixed by #953

Comments

@guwirth
Copy link
Collaborator

guwirth commented Sep 11, 2016

Hi everyone, with #941 the CPD sensor is using a new algorithm:

  • ignoreLiterals: CPD ignores literal value differences when evaluating a duplicate block. This means that foo=42; and foo=43; will be seen as equivalent.
  • ignoreIdentifiers: Similar to ignoreLiterals but for identifiers; i.e., variable names, methods names, and so forth
  • like http://pmd.sourceforge.net/pmd-4.3.0/cpd.html

This approach detects much more duplicates - maybe too much? Like to ask you to vote, What's a duplicate in your opinion:

Example 1:

bits unixtime1(bits ld1, bits ex1)
{
    ld1 &= 0xFF;
    ld1 -= 51;
    if (ex1 < 1855547904U) ld1--;
    ex1 -= 1855548004U;
    return ex1 / 100 + 42949673U * ld1 - ld1 / 25;
}

bits unixtime2(bits ld2, bits ex2)
{
    ld2 &= 0xFF;
    ld2 -= 51;
    if (ex2 < 1855547904U) ld2--;
    ex2 -= 1855548004U;
    return ex2 / 100 + 42949673U * ld2 - ld2 / 25;
}

Example 2:

int object_exists1(char *fn)
{
    int ob;
    if (xosfile_read_stamped_no_path(fn, &ob, 0, 0, 0, 0, 0)) return 0;
    switch (ob)
    {
    case osfile_IS_FILE:return 1;
    case osfile_IS_DIR:return 1;
    case osfile_IS_IMAGE:return 1;
    }
    return 0;
}

int object_exists2(char *fn)
{
    int ob;
    if (xosfile_read_stamped_no_path(fn, &ob, 1, 1, 1, 1, 1)) return 1;
    switch (ob)
    {
    case osfile_IS_FILE:return 2;
    case osfile_IS_DIR:return 2;
    case osfile_IS_IMAGE:return 2;
    }
    return 0;
}

Example 3:

char* tostring1(int value)
{
    switch (value)
    {
    case 0: return "zero";
    case 1: return "one";
    }
    return "undefined";
}

char* tostring2(int value)
{
    switch (value)
    {
    case 2: return "two";
    case 3: return "three";
    }
    return "undefined";
}

Example 4:

int toint1(char value)
{
    switch (value)
    {
    case '0:' return 0;
    case '1': return 1;
    }
    return -1;
}

int toint2(char value)
{
    switch (value)
    {
    case '2:' return 2;
    case '3': return 3;
    }
    return -1;
}

Example 5:

typedef enum ABC {
    ABC           = 12,
    DEF           = 13,
} ABC_e;

typedef enum DEF {
    ABC           = 14,
    DEF           = 15,
} DEF_e;

Example 6:

typedef enum ABC1 {
    ABC1           = 12,
    DEF1           = 13,
} ABC1_e;

typedef enum ABC2 {
    ABC2           = 12,
    DEF2           = 13,
} ABC2_e;

Please vote

@jmecosta
Copy link
Member

@guwirth why not provide flags to enable/disable all those options as PMD has? i can see some benift of having those in some projects.

@guwirth
Copy link
Collaborator Author

guwirth commented Sep 12, 2016

@jmecosta it's no problem to add some additional configuration settings:

sonar.cxx.cpd.ignoreLiterals=true/false
sonar.cxx.cpd.ignoreIdentifiers=true/false

Nevertheless would be interesting to know your opinion about the cases above to set the defaults. In my opinion the default should be:

sonar.cxx.cpd.ignoreLiterals=false
sonar.cxx.cpd.ignoreIdentifiers=true

@jmecosta
Copy link
Member

@guwirth i can ask here to a few developers what they think. and let you know

@jmecosta
Copy link
Member

So think 1, 6 should be enabled by default, and perhaps 2. So i supose in our case we would use
sonar.cxx.cpd.ignoreLiterals=true in order to ignore example 5?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants