Add a flux component for LLNL
Fine tuning of flux component
Fix a few minor issues with the initial cut:
* The job id could be obtained from the PMI kvsname, as with SLURM,
  but it is simpler to call getenv("FLUX_JOB_ID")
* Flux pmi-1 doesn't define PMI_BOOL, PMI_TRUE, PMI_FALSE
* Flux pmi-1 maps the deprecated PMI_Get_kvs_domain_id() to
  PMI_KVS_Get_my_name() internally, so just call that instead.
* Drop residual slurm references.
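
As a rough illustration of the getenv approach in the first bullet, a minimal sketch might look like the following. The helper name flux_job_id_from_env() is hypothetical and is not taken from the component source; only the FLUX_JOB_ID variable name comes from this commit.

```c
#include <stdlib.h>

/* Sketch only: return the Flux job id string from the environment,
 * or NULL when FLUX_JOB_ID is unset or empty.
 * flux_job_id_from_env() is a hypothetical helper name. */
static const char *flux_job_id_from_env(void)
{
    const char *id = getenv("FLUX_JOB_ID");

    /* Treat an empty value the same as an unset variable. */
    return (id != NULL && id[0] != '\0') ? id : NULL;
}
```

A caller could treat a NULL result as "not launched under Flux" and decline to select the component.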

Add wrappers for PMI functions so that if HAVE_FLUX_PMI_LIBRARY
is not defined, the component can dlopen libpmi.so at location
specified by the FLUX_PMI_LIBRARY_PATH env variable, which adds
flexibility.  If HAVE_FLUX_PMI_LIBRARY is defined, link with
libpmi.so at build time in the usual way.
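
The wrapper scheme described above could be sketched roughly as follows for a single PMI-1 call. This is not the component's actual code: the names wrap_PMI_Get_size(), pmi_dso_open() and PMI_FAIL_STANDIN are hypothetical, and the <pmi.h> header name is an assumption. Only HAVE_FLUX_PMI_LIBRARY, FLUX_PMI_LIBRARY_PATH and the PMI_Get_size(int *) signature are taken from this commit or from PMI-1.

```c
#include <dlfcn.h>
#include <stdlib.h>

#ifdef HAVE_FLUX_PMI_LIBRARY
#include <pmi.h>                /* assumed header name for the linked library */
#endif

#define PMI_FAIL_STANDIN (-1)   /* illustrative stand-in for a PMI failure code */

static void *pmi_dso = NULL;    /* handle from dlopen(), NULL until opened */

/* Open the library named by FLUX_PMI_LIBRARY_PATH once; NULL on failure. */
static void *pmi_dso_open(void)
{
    if (pmi_dso == NULL) {
        const char *path = getenv("FLUX_PMI_LIBRARY_PATH");
        if (path != NULL) {
            pmi_dso = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
        }
    }
    return pmi_dso;
}

/* Wrapper with the same shape as PMI-1's int PMI_Get_size(int *size). */
static int wrap_PMI_Get_size(int *size)
{
#ifdef HAVE_FLUX_PMI_LIBRARY
    return PMI_Get_size(size);  /* symbol resolved at link time */
#else
    int (*fn)(int *) = NULL;
    void *dso = pmi_dso_open();

    if (dso == NULL) {
        return PMI_FAIL_STANDIN;
    }
    /* POSIX-recommended idiom for storing a dlsym() result in a
     * function pointer without a direct object-to-function cast. */
    *(void **)(&fn) = dlsym(dso, "PMI_Get_size");
    if (fn == NULL) {
        return PMI_FAIL_STANDIN;
    }
    return fn(size);
#endif
}
```

With this shape the rest of the component calls only the wrappers, so the choice between build-time linking and runtime loading stays confined to one place.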

Update configury for flux component

Update m4 so the configure options work as follows:

 --with-flux-pmi
      Build Flux PMI support (default: yes)

 --with-flux-pmi-library
      Link Flux PMI support with PMI library at build
      time. Otherwise the library is opened at runtime at
      location specified by FLUX_PMI_LIBRARY_PATH environment
      variable. Use this option to enable Flux support when
      building statically or without dlopen support (default: no)

If the latter option is provided, the library/header is located at
build time using the pkg-config module 'flux-pmi'.  Otherwise there
is no library/header dependency.

Handle the case where ompi is configured with --disable-dlopen
or --enable-static.  In those cases, don't build the component
unless --with-flux-pmi-library is provided.

It is fatal if the user explicitly requests --with-flux-pmi but
it cannot be built (e.g. due to --disable-dlopen).

Add a schizo/flux component

Update schizo/flux component

Eliminate slurm-specific use cases.

Since the module is only loaded if FLUX_JOB_ID is set, there are
only two cases to handle:

1) App was launched indirectly through mpirun.  This is not yet
supported with Flux, but the hook remains in case this mode is
supported in the future.

2) App was launched directly by Flux, with Flux providing
CPU binding, if any.

Fix up white space in pmix/flux component

Drop non-blocking fence from pmix:flux component

The flux PMI-1 library is not thread safe, therefore
register a regular blocking fence callback instead of the
thread-shifting fencenb().

pmix/flux component avoids extra PMI_KVS_Gets

Keys stored into the base cache under the wildcard
rank are not intended to be part of the global key namespace.
These keys therefore should not trigger a PMI_KVS_Get() if they
are not found in the cache.
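
The lookup rule above can be sketched with a toy cache. Everything here is a stand-in chosen for illustration: RANK_WILDCARD, the cache table, remote_get() (which only counts how often a PMI_KVS_Get() would have been issued) and lookup() are hypothetical names, not the component's real data structures.

```c
#include <stddef.h>
#include <string.h>

#define RANK_WILDCARD (-1)      /* stand-in for the wildcard rank */

struct cache_entry {
    int rank;
    const char *key;
    const char *value;
};

/* Toy cache standing in for the base cache mentioned above. */
static const struct cache_entry cache[] = {
    { RANK_WILDCARD, "local.setting", "on"    },
    { 0,             "endpoint",      "addr0" },
};

static int remote_gets = 0;     /* counts simulated PMI_KVS_Get() calls */

static const char *cache_lookup(int rank, const char *key)
{
    size_t i;

    for (i = 0; i < sizeof(cache) / sizeof(cache[0]); i++) {
        if (cache[i].rank == rank && strcmp(cache[i].key, key) == 0) {
            return cache[i].value;
        }
    }
    return NULL;
}

/* Stand-in for a PMI_KVS_Get() round trip; always misses here. */
static const char *remote_get(int rank, const char *key)
{
    (void)rank;
    (void)key;
    remote_gets++;
    return NULL;
}

static const char *lookup(int rank, const char *key)
{
    const char *value = cache_lookup(rank, key);

    /* A cache hit, or a miss on a wildcard-rank key, never goes remote:
     * wildcard keys are local-only and not in the global namespace. */
    if (value != NULL || rank == RANK_WILDCARD) {
        return value;
    }
    return remote_get(rank, key);
}
```

In this sketch a miss under the wildcard rank returns NULL immediately, while a miss for a concrete rank still falls through to the (simulated) KVS fetch.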

Minor pmix/flux component cleanup

pmix/flux: drop code for fetching unused pmix_id

pmix/flux: err_exit must return error

Problem: in flux_init(), although 'ret' (variable holding
err_exit return code) is initialized to OPAL_ERROR, the
variable is reused as a temporary result code, so if there are
some successes followed by a failure that doesn't set 'ret',
flux_init() could return success with PMI not initialized.

Ensure that a "goto err_exit" returns OPAL_ERROR if 'ret'
is not set to some other error code.
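
One way to realize this guard is sketched below. It is an illustration of the pattern, not the actual change to flux_init(): init_sketch() and the SK_ constants are hypothetical stand-ins with illustrative values.

```c
/* Illustrative stand-ins for OPAL_ return codes. */
#define SK_SUCCESS    0
#define SK_ERROR     (-1)
#define SK_ERR_STEP1 (-2)

/* first_ok / second_ok simulate whether each init step succeeds. */
static int init_sketch(int first_ok, int second_ok)
{
    int ret = SK_ERROR;

    /* 'ret' is reused as a temporary result code, as in the problem
     * description: after this step succeeds it holds SK_SUCCESS. */
    ret = first_ok ? SK_SUCCESS : SK_ERR_STEP1;
    if (ret != SK_SUCCESS) {
        goto err_exit;
    }

    /* A failure path that jumps out without touching 'ret'. */
    if (!second_ok) {
        goto err_exit;
    }

    return SK_SUCCESS;

err_exit:
    /* Guard: never report success from the error path. */
    if (ret == SK_SUCCESS) {
        ret = SK_ERROR;
    }
    return ret;
}
```

Without the guard, init_sketch(1, 0) would return SK_SUCCESS even though the second step failed; a specific error code set before the jump is still preserved.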

pmix/flux: don't mix OPAL_ and PMI_ return codes

Problem: flux_init() can return both PMI_ and OPAL_ return
codes.  Although OPAL_SUCCESS and PMI_SUCCESS are both defined
as 0, other codes are not compatible.

Ensure that flux_init() consistently uses 'rc' for PMI_
return codes and 'ret' for OPAL_ return codes.
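
The rc/ret split might look like the sketch below. The EX_ constants use illustrative values chosen so that the two code spaces visibly differ, and fake_pmi_call() and init_split_sketch() are hypothetical; none of this is copied from the component.

```c
/* Illustrative stand-ins; values chosen so the code spaces differ. */
enum { EX_PMI_SUCCESS = 0, EX_PMI_FAIL = 1 };
enum { EX_OPAL_SUCCESS = 0, EX_OPAL_ERROR = -1 };

/* Simulated PMI call that returns a PMI_-style code. */
static int fake_pmi_call(int fail)
{
    return fail ? EX_PMI_FAIL : EX_PMI_SUCCESS;
}

static int init_split_sketch(int pmi_fails)
{
    int rc;                     /* holds PMI_-style codes only */
    int ret = EX_OPAL_ERROR;    /* holds OPAL_-style codes only */

    rc = fake_pmi_call(pmi_fails);
    if (rc != EX_PMI_SUCCESS) {
        /* Translate instead of returning rc to an OPAL caller. */
        ret = EX_OPAL_ERROR;
        goto err_exit;
    }

    ret = EX_OPAL_SUCCESS;

err_exit:
    return ret;
}
```

Keeping the two variables apart means the function's return statement only ever sees a value from the OPAL_ space, regardless of which PMI call failed.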

pmix/flux: factor out repeated code for cache put

Signed-off-by: Ralph Castain <[email protected]>
Ralph Castain committed Dec 17, 2016
1 parent ced245d commit 215d629
Showing 11 changed files with 1,248 additions and 0 deletions.
38 changes: 38 additions & 0 deletions opal/mca/pmix/flux/Makefile.am
@@ -0,0 +1,38 @@
#
# Copyright (c) 2014-2016 Intel, Inc. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

sources = \
pmix_flux.h \
pmix_flux_component.c \
pmix_flux.c

# Make the output library in this directory, and name it either
# mca_<type>_<name>.la (for DSO builds) or libmca_<type>_<name>.la
# (for static builds).

if MCA_BUILD_opal_pmix_flux_DSO
component_noinst =
component_install = mca_pmix_flux.la
else
component_noinst = libmca_pmix_flux.la
component_install =
endif

mcacomponentdir = $(opallibdir)
mcacomponent_LTLIBRARIES = $(component_install)
mca_pmix_flux_la_SOURCES = $(sources)
mca_pmix_flux_la_CPPFLAGS = $(FLUX_PMI_CFLAGS)
mca_pmix_flux_la_LDFLAGS = -module -avoid-version
mca_pmix_flux_la_LIBADD = $(FLUX_PMI_LIBS)

noinst_LTLIBRARIES = $(component_noinst)
libmca_pmix_flux_la_SOURCES = $(sources)
libmca_pmix_flux_la_CPPFLAGS = $(FLUX_PMI_CFLAGS)
libmca_pmix_flux_la_LDFLAGS = -module -avoid-version
libmca_pmix_flux_la_LIBADD = $(FLUX_PMI_LIBS)
63 changes: 63 additions & 0 deletions opal/mca/pmix/flux/configure.m4
@@ -0,0 +1,63 @@
# -*- shell-script -*-
#
# Copyright (c) 2014-2016 Intel, Inc. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#

# MCA_opal_pmix_flux_CONFIG([action-if-found], [action-if-not-found])
# -----------------------------------------------------------
AC_DEFUN([MCA_opal_pmix_flux_CONFIG], [

AC_CONFIG_FILES([opal/mca/pmix/flux/Makefile])

AC_ARG_WITH([flux-pmi],
[AC_HELP_STRING([--with-flux-pmi],
[Build Flux PMI support (default: yes)])])

AC_ARG_WITH([flux-pmi-library],
[AC_HELP_STRING([--with-flux-pmi-library],
[Link Flux PMI support with PMI library at build time. Otherwise the library is opened at runtime at location specified by FLUX_PMI_LIBRARY_PATH environment variable. Use this option to enable Flux support when building statically or without dlopen support (default: no)])])


# pkg-config check aborts configure on failure
AC_MSG_CHECKING([if user wants Flux support to link against PMI library])
AS_IF([test "x$with_flux_pmi_library" != "xyes"],
[AC_MSG_RESULT([no])
$3],
[AC_MSG_RESULT([yes])
PKG_CHECK_MODULES([FLUX_PMI], [flux-pmi], [], [])
have_flux_pmi_library=yes
AC_DEFINE([HAVE_FLUX_PMI_LIBRARY], [1],
[Flux support builds against external PMI library])
])

AC_MSG_CHECKING([if Flux support allowed to use dlopen])
AS_IF([test $OPAL_ENABLE_DLOPEN_SUPPORT -eq 1 && test "x$compile_mode" = "xdso"],
[AC_MSG_RESULT([yes])
flux_can_dlopen=yes
],
[AC_MSG_RESULT([no])
])

AC_MSG_CHECKING([if Flux PMI support can be built])
AS_IF([test "x$with_flux_pmi" != "xno" && ( test "x$have_flux_pmi_library" = "xyes" || test "x$flux_can_dlopen" = "xyes" ) ],
[AC_MSG_RESULT([yes])
opal_enable_flux=yes
],
[AC_MSG_RESULT([no])
AS_IF([test "x$with_flux_pmi" = "xyes"],
[AC_MSG_ERROR([Aborting since Flux PMI support was requested])
])
])

# Evaluate succeed / fail
AS_IF([test "x$opal_enable_flux" = "xyes"],
[$1
# need to set the wrapper flags for static builds
pmix_flux_WRAPPER_EXTRA_LIBS="$FLUX_PMI_LIBS"],
[$2])
])
7 changes: 7 additions & 0 deletions opal/mca/pmix/flux/owner.txt
@@ -0,0 +1,7 @@
#
# owner/status file
# owner: institution that is responsible for this package
# status: e.g. active, maintenance, unmaintained
#
owner: INTEL
status: active