Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM: GCC optimal/generic architecture compiler flags #1974

Merged
merged 21 commits into from
Jan 6, 2017

Conversation

geimer
Copy link
Contributor

@geimer geimer commented Oct 31, 2016

This PR adds support for optimal/generic architecture compiler flags for the GCC compiler. This required basing the flag selection on the CPU architecture rather than the CPU family.

AArch32: The generic option assumes an ARMv7 platform. Using EasyBuild on older ARM architectures probably isn't of interest anyhow.

AArch64: Since the -mcpu=native option is only supported since GCC 6, a simple heuristic to target the detected CPU model for vanilla ARM cores has been implemented for older compilers. This is a "best effort" approach that may fail if the actual compiler does not (yet) support the corresponding CPU model. However, in this case the user still has the option to manually specify a reasonable option via --optarch=....

@geimer geimer mentioned this pull request Oct 31, 2016
4 tasks
systemtools.POWER: 'mcpu=generic-arch', # no support for -march on POWER
systemtools.AARCH32: 'mcpu=generic-armv7', # implies -march=armv7 and -mtune=generic-armv7
systemtools.AARCH64: 'mcpu=generic', # implies -march=armv8-a and -mtune=generic
systemtools.POWER: 'mcpu=generic-arch', # no support for -march on POWER
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackPerdue Did you ever test this? I could not find any indication in the GCC manuals that generic-arch is a valid cpu_type value for the -mcpu option on POWER...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pescobar: it seems like you added this while implementing support for --optarch=GENERIC? cfr. 7f44bb6

Any pointers to where you got this from?

Copy link
Member

@boegel boegel Nov 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pescobar ping on the question above?

@JackPerdue
Copy link
Contributor

To be honest, I haven't looked at Power for a bit, but I
don't think that was me. e.g.

7f44bb6

I remember adding -mcpu=native but I don't recall adding
any generic-arch stuff.

Jack Perdue
Lead Systems Administrator
High Performance Research Computing
TAMU Division of Research
[email protected] http://hprc.tamu.edu
HPRC Helpdesk: [email protected]

On 10/31/2016 12:09 PM, geimer wrote:

@geimer commented on this pull request.


In easybuild/toolchains/compiler/gcc.py
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_hpcugent_easybuild-2Dframework_pull_1974-23pullrequestreview-2D6480259&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=yuyoBkmTkIQPbv1BTF9U27ww5Lm7GhsMmWcQG9gmjbA&m=Ap2hx1xXnf8wEpvBolJrZaSGNDm6vyj0PpGqK62Zpw8&s=kJyebYM8pVargcFQCQtjUSaaqS_M8-6MBslQUAqz-2o&e=:

 }
  # used with --optarch=GENERIC
  COMPILER_GENERIC_OPTION = {
  •    systemtools.AMD : 'march=x86-64 -mtune=generic',
    
  •    systemtools.INTEL : 'march=x86-64 -mtune=generic',
    
  •    systemtools.POWER: 'mcpu=generic-arch',  # no support for -march on POWER
    
  •    systemtools.AARCH32: 'mcpu=generic-armv7', # implies -march=armv7 and -mtune=generic-armv7
    
  •    systemtools.AARCH64: 'mcpu=generic',       # implies -march=armv8-a and -mtune=generic
    
  •    systemtools.POWER: 'mcpu=generic-arch',    # no support for -march on POWER
    

@JackPerdue
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JackPerdue&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=yuyoBkmTkIQPbv1BTF9U27ww5Lm7GhsMmWcQG9gmjbA&m=Ap2hx1xXnf8wEpvBolJrZaSGNDm6vyj0PpGqK62Zpw8&s=FIFWfVytAtVKOkJqfOlxIXkcWVYGcRwuCk2M7scPEBA&e=
Did you ever test this? I could not find any indication in the GCC
manuals that |generic-arch| is a valid /cpu_type/ value for the
|-mcpu| option on POWER...


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_hpcugent_easybuild-2Dframework_pull_1974-23pullrequestreview-2D6480259&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=yuyoBkmTkIQPbv1BTF9U27ww5Lm7GhsMmWcQG9gmjbA&m=Ap2hx1xXnf8wEpvBolJrZaSGNDm6vyj0PpGqK62Zpw8&s=kJyebYM8pVargcFQCQtjUSaaqS_M8-6MBslQUAqz-2o&e=,
or mute the thread
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AHdRz-2DG-5FYxI2ZFJcMUaDbMWmL7w1rN1yks5q5iDggaJpZM4KlQRR&d=CwMCaQ&c=ODFT-G5SujMiGrKuoJJjVg&r=yuyoBkmTkIQPbv1BTF9U27ww5Lm7GhsMmWcQG9gmjbA&m=Ap2hx1xXnf8wEpvBolJrZaSGNDm6vyj0PpGqK62Zpw8&s=TwQouzexWZNjbonnS4dXosorD97e692fIaFlIym9E4g&e=.

@JackPerdue
Copy link
Contributor

A quick test rebuilding HPL-2.1-foss-2016b.eb with --optarch=GENERIC shows a definite fail. I'm not used to using the GENERIC flag but, yes, GCC won't work on Power as is.

@JackPerdue
Copy link
Contributor

-mcpu=native -mtune=native might be the best way to go for now if the GENERIC needs to be differentiated in some way... but generic-arch is right out (as you suspected)

@JackPerdue
Copy link
Contributor

gcc: note: valid arguments to ‘-mtune=’ are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell e300c2 e300c3 e500mc e500mc64 e5500 e6500 ec603e native power3 power4 power5 power5+ power6 power6x power7 power8 powerpc powerpc64 rs64 titan

I don't have any suggestions beyond those not understanding what "generic" means on Power.

@boegel boegel added this to the v3.0 milestone Oct 31, 2016
@@ -84,14 +84,12 @@ class Clang(Compiler):

# used when 'optarch' toolchain option is enabled (and --optarch is not specified)
COMPILER_OPTIMAL_ARCHITECTURE_OPTION = {
systemtools.INTEL : 'march=native',
systemtools.AMD : 'march=native',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer it seems unwise to drop these?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, wait, this relates to your change in compiler.py where you define self.arch differently...

# Construct -mcpu option
option = 'mcpu=%s' % '.'.join([i[1] for i in sorted_core_types])

self.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[systemtools.AARCH64] = option
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer don't we want some kind of warning in case option is an empty string here?

# big.LITTLE setups: sort core types to have big core (higher model number) first
core_types = [core.strip().lower() for core in cpu_model.split('+')]
sorted_core_types = sorted([(int(re.search('\d+', i).group(0)), i) for i in core_types],
reverse=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's lots that could go wrong here, and there's no error handling whatsoever; aren't we making too much assumption here on what get_cpu_model returns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel To be honest, I copy/paste-engineered this based on some code snippet I found on the web (StackOverflow, IIRC). I have no clue how to add error handling here...

Regarding your concerns about assumptions: This is certainly an issue, yes. However, I currently see no other way to optimize for the host architecture if -mcpu=native is not supported. If you have any better suggestions, please let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's mostly a case of error handling: what if our assumption is wrong? Then we should fall back to not picking a value for option (which we should rename to optarch_opt here?).

The only real issue/assumption you're making here is that i) get_cpu_model() returns something that starts with ARM, and ii) that we can find find a number in every item in core_types.

So, I think this should become something like this:

optarch_opt = ''
cpu_vendor, cpu_model = systemtools.get_cpu_vendor(), systemtools.get_cpu_model()
if cpu_vendor == systemtools.ARM and cpu_model.startswith('ARM '):
    self.log.debug("Determining architecture-specific compiler optimization flag for ARM (model: %s)", cpu_model)
    core_types = []
    for core_type in [ct.strip().lower() for ct in cpu_model[4:].split('+')]:
        # determine numeric id for each core type, since we need to sort them later numerically
        res = re.search('\d+$', core_type)  # note: numeric ID is expected at the end
        if res:
            core_id = int(res.group(0))
            core_types.append((core_id, core_type))
            self.log.debug("Extracted numeric ID for ARM core type '%s': %s", core_type, core_id)
        else:
            # if we can't determine numeric id, bail out
            core_types = None
            self.log.debug("Failed to extract numeric ID for ARM core type '%s', bailing out", core_type)
            break
    if core_types:
        # example: 'mcpu=cortex-a72.cortex-a53' for "ARM Cortex-A53 + Cortex-A72"
        optarch_opt = 'mcpu=%s' '.'.join([ct[1] for ct in sorted(core_types, reverse=True)])
        self.log.debug("Using architecture-specific compiler optimization flag '%s'", optarch_opt)

self.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[systemtools.AARCH64] = option

sorted_core_types = sorted([(int(re.search('\d+', i).group(0)), i) for i in core_types],
reverse=True)
# Construct -mcpu option
option = 'mcpu=%s' % '.'.join([i[1] for i in sorted_core_types])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you provide a couple of examples in a comment of how what value would be specified to mcpu with this approach (just in case we get something entirely different at some point)?

@@ -146,8 +146,11 @@ def set_options(self, options):

def set_variables(self):
"""Set the variables"""
if self.arch is None:
self.arch = systemtools.get_cpu_architecture()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, aren't there any other locations where we rely on the .arch attribute of a Toolchain instance (potentially) that we should worry about?

playing the devil's advocate here, since this approach makes more sense anyway (defining self.arch with the result of get_cpu_family was already an indication that something wasn't 100% right...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel As far as I could figure out using grep -r '\.arch' on framework, easyblocks, and easyconfigs, Compiler.arch (it's not defined in the Toolchain base class) is only used in tools/toolchain/compiler.py, in the toolchain unit tests, and now in toolchains/compiler/gcc.py.

BTW: Is there any reason why it isn't set in the constructor already? In this case, lazy initialization doesn't make much sense to me. But I may be overlooking something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing any other use of .arch elsewhere is good, but we don't control all easyblocks out there, and people have scripts around that hook into the EasyBuild framework, who knows...

That shouldn't hold us back from making this change though, since it makes a whole lot of sense, and if anyone relies on .arch somewhere, they probably shouldn't be anyway (they should call the appropriate function from systemtools instead).

I see no reason either for lazy init of self.arch, so feel free to change that.

@@ -339,6 +353,20 @@ def test_optarch_generic(self):
else:
self.assertFalse(generic_flags in tc.get_variable(var))

def test_optarch_aarch64_heuristic(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer don't we want to test the default behaviour for AARCH64 as well here?

you can put a dummy module in place for GCC 6.x to check that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel This is causing some trouble I don't understand... I've added a module file for GCCcore 6.2.0 and added

tc = self.get_toolchain("GCCcore", version="6.2.0")
tc.prepare()
self.assertEqual(...)

at the end of this test. But it is still giving me the modified mcpu flag based on the heuristic, even though it is definitely not taking the "modify option with heuristic" branch in _set_optimal_architecture. Why don't I get a fresh instance with COMPILER_OPTIMAL_ARCHITECTURE_OPTION initialized to the defaults???

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer Hmm, you should get a fresh instance... Do make sure you also call tc.set_options() before tc.prepare() though, even if you're not setting any custom toolchain options. Maybe that explains it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel Nope. Here is my current test (with a GCCcore/6.2.0 module in place):

    def test_optarch_aarch64_heuristic(self):
        """Test whether AArch64 pre-GCC-6 optimal architecture flag heuristic works."""
        st.get_cpu_architecture = lambda: st.AARCH64
        st.get_cpu_model = lambda: 'ARM Cortex-A53'
        st.get_cpu_vendor = lambda: st.ARM
        tc = self.get_toolchain("GCC", version="4.7.2")
        tc.prepare()
        self.assertEqual(tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[tc.arch], 'mcpu=cortex-a53')

        st.get_cpu_model = lambda: 'ARM Cortex-A53 + Cortex-A72'
        tc = self.get_toolchain("GCC", version="4.7.2")
        tc.prepare()
        self.assertEqual(tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[tc.arch], 'mcpu=cortex-a72.cortex-a53')

        tc = self.get_toolchain("GCCcore", version="6.2.0")
        tc.prepare()
        self.assertEqual(tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[tc.arch], 'mcpu=native')

I've added

    print(self.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[systemtools.AARCH64])

to _set_optimal_architecture in easybuild-framework/toolchains/compiler/gcc.py right at the beginning after the check that we're on AArch64 (i.e., before modifying the default value). And it prints

mcpu=native
mcpu=cortex-a53
mcpu=cortex-a72.cortex-a53

Calling tc.set_options({}) right before tc.prepare() in the test doesn't change anything. So it keeps the modified value even when creating a new instance. What am I doing wrong here? I'm confused...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer Taking a look at this right now, I'll keep you posted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for completeness sake: this problem was fixed in geimer#12

@@ -71,13 +71,11 @@ class IntelIccIfort(Compiler):

# used when 'optarch' toolchain option is enabled (and --optarch is not specified)
COMPILER_OPTIMAL_ARCHITECTURE_OPTION = {
systemtools.INTEL : 'xHost',
systemtools.AMD : 'xHost',
systemtools.X86_64: 'xHost',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer hmm, I'm not so sure we want to drop the distinction between AMD and INTEL here... right now, it doesn't matter, but it may sometime in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel Hmm... for ARM, one needs to distinguish by architecture (as both AArch32 and AArch64 are of CPU family ARM), for X64_64 you want a distinction by CPU family. I can see various ways to address this:

  • Use the architecture as dictionary key (like now) and overwrite the value in _set_optimal_architecture if needed
  • Do it exactly the other way around (family as key, overwrite on architecture)
  • Use a tuple (architecture, family) as dictionary key

If you really want to keep the family distinction, I'm leaning towards the last option. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going with the latter makes sense to me!

@boegel
Copy link
Member

boegel commented Nov 12, 2016

@geimer Looks like this will have to be moved to the 3.1 milestone, are you OK with that?

@boegel boegel modified the milestones: v3.1, v3.0 Nov 15, 2016
@geimer
Copy link
Contributor Author

geimer commented Nov 23, 2016

@boegel I still have no clue why this unit test is failing. I expect to get a fresh toolchain instance for each call to self.get_toolchain with the values of COMPILER_OPTIMAL_ARCHITECTURE_OPTION reset to the defaults. However, my debugging shows that this is not the case. Is there some kind of caching going on?

@boegel
Copy link
Member

boegel commented Nov 25, 2016

@geimer the problem you're running into is fixed with geimer#12 (see PR description for details)

@JackPerdue
Copy link
Contributor

Again... I don't know who added mcpu=generic-arch for POWER or why and I have no idea what GENERIC is supposed to be but --optarch=GENERIC on Power7 is broken in 3.0.... there is no "generic-arch" option for mcpu (see above).

@boegel
Copy link
Member

boegel commented Nov 25, 2016

@JackPerdue yeah, the plan is to fix that in this PR as well, since we're fiddling in that area.

Not quite sure what --optarch=GENERIC should imply on POWER(7) though...

We could just spit out an error stating this is not supported on POWER.

pass down default optarch value for AAarch64 & GCC 6.x rather than updating class constant
@geimer
Copy link
Contributor Author

geimer commented Dec 18, 2016

@JackPerdue, @boegel: Looking through the GCC manuals of various versions, these may be reasonable values for POWER:

-mcpu=powerpc, -mcpu=powerpc64, and -mcpu=powerpc64le specify pure 32-bit PowerPC (either endian), 64-bit big endian PowerPC and 64-bit little endian PowerPC architecture machine types, with an appropriate, generic processor model assumed for scheduling purposes.

As far as I can see, powerpc64le is supported since GCC 4.8, the other two were already available in the 3.x series. However, this means that we have to distinguish between big and little endian POWER (with everything up to POWER7 being big endian and POWER8 being little endian)

tc = self.get_toolchain("GCCcore", version="6.2.0")
tc.set_options({})
tc.prepare()
self.assertEqual(tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[tc.arch], 'mcpu=native')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK to compare with tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION here? In the other cases we compare against tc.options.options_map['optarch'].

tc = self.get_toolchain("GCCcore", version="6.2.0")
tc.set_options({})
tc.prepare()
self.assertEqual(tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION[(tc.arch, tc.cpu_family)], 'mcpu=native')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... One should refresh the page before commenting ;-) So, again:

Is it OK to compare with tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION here? In the other cases we compare against tc.options.options_map['optarch'].

@geimer
Copy link
Contributor Author

geimer commented Dec 18, 2016

@boegel The failing test seems to be due to an inconsistent behavior of TCL modules. Any suggestions how to deal with this?

Edit: Maybe run self.modtool.available('GCC/'), i.e., using a trailing slash?

(currently not required due to the way it is implemented, but may cause
subtle errors in the future if the implementation changes)
rather than tc.COMPILER_OPTIMAL_ARCHITECTURE_OPTION
@geimer
Copy link
Contributor Author

geimer commented Dec 20, 2016

@boegel I believe this PR is ready for another review. However, there is still three issues:

  • One unit test is failing in Travis, but only with the pure TCL modules (looks like an inconsistent behavior). If you have a suggestion how to fix this, please let me know.
  • The --optarch=GENERIC thing on POWER should probably be addressed in a separate PR. See POWER: --optarch=GENERIC broken for GCC #2064.
  • A couple of unit tests are failing when run natively on ARM due to hard-coded Intel stuff. I'm up for tackling those, but again probably in a separate PR to keep things manageable. This is task 3 of Revise ARM platform support #1952.

@boegel
Copy link
Member

boegel commented Dec 23, 2016

@geimer fix for the broken test in geimer#13

fix broken test_avail, GCCcore is only included in result for Lmod/EnvironmentModulesC modules tools
(systemtools.AARCH64, systemtools.ARM): 'mcpu=native', # since GCC 6; implies -march=native and -mtune=native
(systemtools.POWER, systemtools.POWER): 'mcpu=native', # no support for -march on POWER; implies -mtune=native
(systemtools.X86_64, systemtools.AMD): 'march=native', # implies -mtune=native
(systemtools.X86_64, systemtools.INTEL): 'march=native', # implies -mtune=native
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer this makes me wonder, should we also be using -mcpu=native on x86_84, since that implies both -march=native and -mtune=native for AARCH32?

Which one is closer to Intel's -xHost, mcpu=native or -march=native?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel: These kind of options are a mess in GCC, as they are inconsistent between architectures... -march=native is the way to go on x86_64.

On x86_64, -march=native implies -mtune=native, but -mcpu is a deprecated synonym for -mtune. So using -mcpu=native would only tune for the host, but not enable all CPU features (as -march does).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer That sounds like a mess indeed, thanks for clarifying!

# On AArch64, -mcpu=native is not supported prior to GCC 6. In this case, try to optimize for the detected
# CPU model (vanilla ARM cores only). Note that this heuristic may fail if the CPU model is not supported
# by the GCC version being used.
gcc_version = self.get_software_version(self.COMPILER_MODULE_NAME)[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer: did you ever test this with a GCC module that is actually a bundle of GCCcore and binutils?

As far as I can tell, grabbing the GCC version will fail in that case, as least with this current implementation (which only takes into account GCC as module name)...

Copy link
Member

@boegel boegel Jan 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see why this worked: we usually make sure in the GCC bundles that $EBROOTGCCCORE and $EBVERSIONGCCCORE are also defined for the GCCcore module, by defining altroot/altversion in the GCC bundle easyconfig, see https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/g/GCC/GCC-6.2.0-2.27.eb#L21

However, it would be better to take both GCCcore (first) and GCC (second)into account for determining the GCC version, like we already do for the installation prefix in_set_compiler_vars`.

Are you up for making that change @geimer?

edit: $EBROOTGCC and $EBVERSIONGCC => $EBROOTGCCCORE and $EBVERSIONGCCCORE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel: I have to admit that I don't remember whether I tested this with a GCC bundle. But...

As far as I can see, self.COMPILER_MODULE_NAME corresponds to GCC when using the GCC toolchain, and to GCCcore when using this toolchain. So it should actually do the right thing, no?

Copy link
Member

@boegel boegel Jan 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no, since when GCC is being used as a toolchain, the compiler is actually provided via GCCcore, so we should grab the version directly from there (without relying on a GCC bundle also defining $EBROOTGCCCORE...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boegel Just to clarify: You are proposing to use get_software_version from easybuild.tools.modules here rather than self.get_software_version, right? I don't think I understand the difference right now (it's been a long day...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I propose to try obtaining the version using GCCcore first, and fall back to use GCC if that doesn't work. Whether you use self.get_software_version or get_software_version doesn't really matter here, that's only relevant when multiple modules are in play (self.get_software_version has a way of dealing with that).

So, something like this:

gcc_ver = get_software_version('GCCcore')
if gcc_ver is None:
    gcc_ver = get_software_version('GCC')

(switching to get_software_version makes sense here since we know we'll never be dealing with multiple names at the same time here)

cpu_vendor = systemtools.get_cpu_vendor()
cpu_model = systemtools.get_cpu_model()
if cpu_vendor == systemtools.ARM and cpu_model.startswith('ARM '):
self.log.debug("Determining architecture-specific compiler optimization flag for ARM (model: %s)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geimer while giving this one last review, this felt really heavy to keep it all under the same method, so I suggest we flesh out the AARCH64 specific stuff into a separate method, see geimer#14

# On AArch64, -mcpu=native is not supported prior to GCC 6. In this case, try to optimize for the detected
# CPU model (vanilla ARM cores only). Note that this heuristic may fail if the CPU model is not supported
# by the GCC version being used.
gcc_version = self.get_software_version(self.COMPILER_MODULE_NAME)[0]
Copy link
Member

@boegel boegel Jan 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see why this worked: we usually make sure in the GCC bundles that $EBROOTGCCCORE and $EBVERSIONGCCCORE are also defined for the GCCcore module, by defining altroot/altversion in the GCC bundle easyconfig, see https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/g/GCC/GCC-6.2.0-2.27.eb#L21

However, it would be better to take both GCCcore (first) and GCC (second)into account for determining the GCC version, like we already do for the installation prefix in_set_compiler_vars`.

Are you up for making that change @geimer?

edit: $EBROOTGCC and $EBVERSIONGCC => $EBROOTGCCCORE and $EBVERSIONGCCCORE

@boegel
Copy link
Member

boegel commented Jan 5, 2017

Looks good to go to me.

@geimer: let me know whether or not you want to verify whether Travis would've picked up the failing test you fixed with fc6cad1 before I merge this

@geimer
Copy link
Contributor Author

geimer commented Jan 6, 2017

@boegel If you are OK with knowing that the test result shown above for 27b5307 is bogus, please go ahead and merge. In the end, it's only the last test result that really matters.

@boegel
Copy link
Member

boegel commented Jan 6, 2017

@geimer I don't mind at all, I only really care about the test result of the last commit; but it's good to know Travis screws up if you push multiple commits quickly enough.

Going in, thanks!

@boegel boegel merged commit adaa182 into easybuilders:develop Jan 6, 2017
@geimer geimer deleted the arm_compiler_flags branch January 6, 2017 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants