Fix building "generic" TRMM kernel with CMake #3067

Merged

albertziegenhagel
Contributor

The CMake "TARGET_CORE" variable stores the "generic" target name in all lowercase letters, but it is compared against an all-uppercase string, so the wrong TRMM kernel is selected.

This MR converts TARGET_CORE to all uppercase before comparing its value, so that case mismatches can no longer cause this problem.
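
For readers unfamiliar with the pattern, here is a minimal sketch of a case-insensitive comparison in CMake. It is not the actual diff from this MR: the variable name UC_TARGET_CORE and the status messages are assumptions made for illustration, and the real kernel selection logic in OpenBLAS does more than print a message. It relies only on the standard `string(TOUPPER ...)` command.

```cmake
# Illustrative only; run with: cmake -P this_file.cmake
# Assume TARGET_CORE arrives in lowercase, as described above.
set(TARGET_CORE "generic")

# Normalize to uppercase once, then compare against the uppercase literal.
# UC_TARGET_CORE is a hypothetical name chosen for this sketch.
string(TOUPPER "${TARGET_CORE}" UC_TARGET_CORE)

if (UC_TARGET_CORE STREQUAL "GENERIC")
  message(STATUS "TARGET_CORE matches GENERIC")
else ()
  message(STATUS "TARGET_CORE does not match GENERIC")
endif ()
```

Because `STREQUAL` is case-sensitive, comparing the raw value "generic" with "GENERIC" would take the `else` branch. Normalizing first makes the check independent of how the target name was spelled.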

The selection of the wrong TRMM kernel can be seen when compiling OpenBLAS with CMake and the option -DTARGET=GENERIC.

In this case, the tests sblat3, dblat3, cblat3 and zblat3 all fail with an error similar to the following:

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
           EXPECTED RESULT   COMPUTED RESULT
       1      0.186813          0.373626
 ******* STRMM  FAILED ON CALL NUMBER:
    506: STRMM ('L','U','N','U',  1,  1, 1.0, A,  2, B,  2)        .

which is essentially the same failure that was observed in #2257.

@martin-frbg martin-frbg added this to the 0.3.14 milestone Jan 14, 2021
@martin-frbg
Collaborator

Oops - thanks.

@martin-frbg martin-frbg merged commit e378b24 into OpenMathLib:develop Jan 14, 2021