-
Notifications
You must be signed in to change notification settings - Fork 866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix --with-hwloc=external #2955
Conversation
I neglected to mention in the PR description: there is a lengthy explanation of this approach in the commit messages. The short version: allow frameworks to have an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at first glance, this is a more elegant approach.
we can likely get rid of the MCA_hwloc_external_header
macro (including some configury logic), since we should now be able to simply
#include <hwloc.h>
i will make some more thorough testing on Monday
opal/mca/hwloc/Makefile.am
Outdated
@@ -10,7 +10,7 @@ | |||
# | |||
|
|||
# We do not want -I$(srcdir) in AM_CPPFLAGS, or there can be a | |||
# conflict between system hwloc.h and opal/mca/hwloc/hwloc.h. So just | |||
# conflict between system hwloc.h and opal/mca/hwloc/hwloc-internal.h. So just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need to set AM_CPPFLAGS
here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're exactly right -- we don't. I just noticed this when one of the CI bots failed "make dist". I have a fix (where we don't set AM_CPPFLAGS
at all), but I'm doing a little more testing before I push it up here.
After some thought, I think we still need |
19451da
to
5023c92
Compare
makes sense. #include "/include/hwloc.h" |
I don't think so:
/* Location of external hwloc header */
#define MCA_hwloc_external_header "hwloc.h"
Am I missing something? |
@jsquyres you are good, and since git blame points to me, i guess i forgot i fixed this part ... |
I think you guys are swatting mosquitos with sledgehammers. Revamping the MCA system and modifying the headers used across the code base to work around a self-created problem seems extreme. This isn't an inherent issue in the config code - this is a problem caused by a design decision we made with respect to how the user specifies a path to an external hwloc. We chose, for whatever reason, to do this in an atypical way that is now biting us. It therefore seems to me that we have two simpler solutions to these problems:
Whatever we do, let's be fully aware of the extent of the problem caused by this external vs internal dependency. I think you have underestimated the extent of the problem - e.g., you don't appear to remember that specifying --with-hwloc= So why not get rid of this embedded code and save ourselves some pain? |
@rhc54 are you suggesting we simply remove the embedded hwloc component i guess one of the main issue with |
Yes - remove the hwloc components (we still need the base functions), have a config check that rejects hwloc 2.x, and make the user install what we need. Why treat hwloc different from any other required support? As multiple users have pointed out, there is no issue with setting |
this is system dependent, and you described the (potential) issue earlier. |
If pmix is available on the system, then we should use the system one too - again, there is no reason for us to continue treating these packages differently from any other ones. We can have "glue" components for the different versions - just drop the embedding. |
@rhc54 the default assumption of |
If it is broken, then it needs to be fixed regardless of the path forward - that's a standard use-case we always support. |
@rhc54 and I talked about this at length this morning on the phone. The proposed fix on this PR does, indeed, address the issue. It touches a bunch of files, but in trivial ways ( But it feels like a band-aid. Maybe we should stop embedding hwloc. Here's the issues:
|
With some testing, I have determined:
|
Instead, include "opal/mca/hwloc/hwloc.h" Signed-off-by: Jeff Squyres <[email protected]>
Frameworks are usually required to have a framework/framework.h file. However, this is sometimes problematic (see the hwloc use case/problem description, below). This commit allows frameworks to have an "autogen.options" file (i.e., project/mca/framework/autogen.options) that specifies things that autogen needs to know about the framework. Currently, the only option recognized in autogen.options is "framework_header", which allows a framework to specify that its header file is named something other than "framework.h" (the framework header file must still be in the project/mca/framework directory; it simply may be named something other than framework.h). More options may be introduced over time. The use case that motivated this is the hwloc framework (open-mpi#2616). Per MCA framework rules, the hwloc framework is required to have an opal/mca/hwloc/hwloc.h file. However, the hwloc library itself *also* has an hwloc.h file. This causes a problem when configuring Open MPI with --with-hwloc=external (meaning: do not use the hwloc embedded within the Open MPI source code tree -- instead, use an hwloc installation from outside the Open MPI source code tree). Specifically, when in the opal/mca/hwloc directory, the presence of "-I." in DEFAULT_INCLUDES (put there by Automake) causes a confusion between the hwloc.h in opal/mca/hwloc/hwloc.h and the system-installed hwloc.h. Chaos ensues (see the GitHub issue for more detail). The solution is to rename the opal/mca/hwloc/hwloc.h to something else (e.g., hwloc-internal.h), and extend autogen.pl to allow frameworks to have an alternate name for their framework header file. This commit introduces the autogen.pl mechanism to allow the alternate header file name. A follow-on commit will effect this change in the hwloc framework (and update all the places in the code base to use the new filename). Signed-off-by: Jeff Squyres <[email protected]>
5023c92
to
30a3f68
Compare
Per a prior commit, the presence of "hwloc.h" can cause ambiguity when using --with-hwloc=external (i.e., whether to include opal/mca/hwloc/hwloc.h or whether to include the system-installed hwloc.h). This commit: 1. Renames opal/mca/hwloc/hwloc.h to hwloc-internal.h. 2. Adds opal/mca/hwloc/autogen.options to tell autogen.pl to expect to find hwloc-internal.h (instead of hwloc.h) in opal/mca/hwloc. 3. s@opal/mca/hwloc/hwloc.h@opal/mca/hwloc/hwloc-internal.h@g in the rest of the code base. Signed-off-by: Jeff Squyres <[email protected]>
30a3f68
to
fec519a
Compare
go with it 👍 |
@rhc54 and I discussed -- we agree on the long term:
|
This is an alternate approach to #2954 that I came up with while thinking about this problem overnight. It should fix #2616.
FYI: @ggouaillardet @artpol84 @opoplawski @dannyauble