sched-simple issues with quick start on 0.8.0 release #747

Closed
rspavel opened this issue Sep 4, 2020 · 9 comments · Fixed by #757

Comments

@rspavel

rspavel commented Sep 4, 2020

Trying to follow the quick start guide. I am in the process of working through issues on a CTS platform (either spack is providing a dirty pkgconf environment, or the flux-core and flux-sched autotools setup is reading things incorrectly and I need to debug that), but I was able to get a completed install on a dev cluster on a fairly recent spack hash with

spack install [email protected]%[email protected]
# [email protected]%[email protected]~cuda~docs arch=linux-centos7-haswell
# [email protected]%[email protected]~cuda arch=linux-centos7-haswell

flux keygen works but

[rspavel@cn103 fluxTesting]$ flux start --size=4
2020-09-04T15:48:11.826345Z broker.err[0]: rc1: flux-module: cmb.rmmod sched-simple: No such file or directory
2020-09-04T15:48:11.827180Z broker.err[0]: Run level 1 Exited with non-zero status (rc=1) 2.7s
2020-09-04T15:48:12.190757Z broker.err[0]: rc3: flux-module: cmb.rmmod qmanager: No such file or directory
2020-09-04T15:48:12.226114Z broker.err[0]: rc3: flux-module: cmb.rmmod resource: No such file or directory
flux-start: 0 (pid 14789) exited with rc=1

when testing from within a slurm allocation.

It looks like this was a known issue as of #525 and a workaround exists, but I am unable to find said workaround and was instructed to make a new issue.

@dongahn
Member

dongahn commented Sep 4, 2020

@rspavel:

a CTS platform
Just curious, is this LLNL's machine?

spack install [email protected]%[email protected]

This is a pretty old version. Can you try the latest version available through Spack? @SteVwonder, can you suggest one for @rspavel?

@rspavel
Author

rspavel commented Sep 4, 2020

No. The CTS machine I am referring to is at LANL, but my thought process was that it is probably the closest to an LLNL environment that I can work on with this charge code.

@0.8.0 is the most recent version in Spack as of yesterday-ish. I've fiddled with @master a bit but that runs into compilation issues that I assume the team is aware of. I THINK I manually added an intermediate version on a previous attempt to use flux but ran into dependency issues. But that was order some unit of time ago.

If there is a known good tagged release with a known good set of dependencies I can update the spackage. Just currently debugging multiple directions of a code I don't understand so I am not too eager to propagate my kludges.

It isn't the best from a software engineering or development standpoint but it sounded like there was a quick fix or config file that would let me make progress from a user standpoint without grabbing my build systems and spack hat.

@dongahn
Member

dongahn commented Sep 4, 2020

@rspavel: good to know this is for LANL use. Given how similar your environment is to LLNL's, we should be able to walk you through making Flux available at LANL. Let me discuss this with the team at today's coffee hour (2PM PDT) and get back to you.

@SteVwonder
Member

Hey @rspavel! Good to see you outside of UD.

If you want to keep pushing forward with 0.8.0, I think you will want to cherry-pick this change: b11f4cc (#619). It only affects the etc/rc scripts, so you can make the change directly to the spack-installed flux-sched etc files.

W.r.t. a newer version of flux with Spack, it should "just work" to do spack install [email protected] ^[email protected] since we've coded Flux's release pattern into the Spack package and haven't added any new dependencies in a while. I have on my TODO list to update the spack package with the latest releases, but it keeps getting overtaken by other TODOs 🙈

@rspavel
Author

rspavel commented Sep 9, 2020

@SteVwonder yeah, small world.

Bumped up to [email protected] ^[email protected] and prepping a PR to spack for when I can verify it but that gets me to the aforementioned boost issue along the lines of

     681          | ^~~~~~~~~~~~~~~~~~~~~~~
     682      CXX      evaluators/libresource_la-edge_eval_api.lo
     683    In file included from ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/detail/adjacency_list.hpp:34,
     684                     from ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/adjacency_list.hpp:255,
     685                     from /tmp/rspavel/spack-stage/spack-stage-flux-sched-0.11.0-bkvpissxa7rda2jsntnxmj3etqp7wzxv/spack-src/resource/readers/resource_spec_grug.cpp:26:
     686    ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/detail/adj_list_edge_iterator.hpp: In member function 'void boost::vec_adj_list_impl<Graph, Conf
            ig, Base>::copy_impl(const boost::vec_adj_list_impl<Graph, Config, Base>&) [with Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, Flux::resource_model::resource_pool_gen_t, Flux::resource_model::relation_gen_t>; Config =
             boost::detail::adj_list_gen<boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, Flux::resource_model::resource_pool_gen_t, Flux::resource_model::relation_gen_t>, boost::vecS, boost::vecS, boost::directedS, Flux::resource_model::re
            source_pool_gen_t, Flux::resource_model::relation_gen_t, boost::no_property, boost::listS>::config; Base = boost::directed_graph_helper<boost::detail::adj_list_gen<boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, Flux::resource_
            model::resource_pool_gen_t, Flux::resource_model::relation_gen_t>, boost::vecS, boost::vecS, boost::directedS, Flux::resource_model::resource_pool_gen_t, Flux::resource_model::relation_gen_t, boost::no_property, boost::listS>::config>]':
  >> 687    ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/detail/adj_list_edge_iterator.hpp:80:13: error: '*((void*)& ei +48)' may be used uninitialized i
            n this function [-Werror=maybe-uninitialized]
     688       80 |             if (edges BOOST_GRAPH_MEMBER first
     689          |             ^~
     690    In file included from ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/adjacency_list.hpp:255,
     691                     from /tmp/rspavel/spack-stage/spack-stage-flux-sched-0.11.0-bkvpissxa7rda2jsntnxmj3etqp7wzxv/spack-src/resource/readers/resource_spec_grug.cpp:26:
     692    ${SPACK_DIR}/opt/spack/linux-centos7-haswell/gcc-9.3.0/boost-1.74.0-ntwkp2xfc4s4yfihf3vufjlcyxwi7xb7/include/boost/graph/detail/adjacency_list.hpp:2186:23: note: '*((void*)& ei +48)' was declared here
     693     2186 |         edge_iterator ei, ei_end;

Different part of boost but it looks similar to this https://stackoverflow.com/questions/21755206/how-to-get-around-gcc-void-b-4-may-be-used-uninitialized-in-this-funct

I guess before I go too far down the rabbit hole of hacking in a -Wno-maybe-uninitialized, is there a project approved method to work around this?

@dongahn
Member

dongahn commented Sep 9, 2020

Please undo the changes in #697

This is due to an issue with the combination of the C++ compiler and Boost in use.

Dong

@SteVwonder
Member

SteVwonder commented Sep 10, 2020

Please undo the changes in #697

The big change in that PR was the CXXFLAGS change, and you can control that from Spack. So I believe this should work as a quick test:

spack install [email protected]  cxxflags="-Wno-unused-local-typedefs -Wno-deprecated-declarations -Wno-unused-variable -Wno-error" ^[email protected]

@SteVwonder
Member

is there a project approved method to work around this?

Not yet. We have it on our September release roadmap to add -isystem to the boost libraries, to avoid this issue in the future: #300
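
For anyone wondering why -isystem would help here, a minimal sketch (not the project's actual build change): compilers such as GCC suppress most warnings that originate in headers found through a system include path, so -Werror no longer trips on them. The header name noisy.hpp and the thirdparty directory below are hypothetical stand-ins for a library like Boost, and some warnings (maybe-uninitialized after inlining, for example) may still leak through in practice.

```shell
#!/bin/sh
# Sketch only: file names and the "thirdparty" directory are hypothetical.
CXX=${CXX:-g++}
RESULT="skipped"   # stays "skipped" when no C++ compiler is available

if command -v "$CXX" >/dev/null 2>&1; then
    WORK=$(mktemp -d)
    mkdir -p "$WORK/thirdparty"

    # A header that triggers -Wunused-variable, standing in for a noisy library header.
    cat > "$WORK/thirdparty/noisy.hpp" <<'EOF'
inline int noisy() { int unused = 0; return 1; }
EOF

    cat > "$WORK/main.cpp" <<'EOF'
#include <noisy.hpp>
int main() { return noisy() - 1; }
EOF

    # With -I the header is treated as project code, so -Werror is expected to fail the build.
    if "$CXX" -Wall -Werror -I"$WORK/thirdparty" -c "$WORK/main.cpp" -o "$WORK/a.o" 2>/dev/null; then
        WITH_I="ok"
    else
        WITH_I="failed"
    fi

    # With -isystem the same directory is treated as a system include path.
    if "$CXX" -Wall -Werror -isystem "$WORK/thirdparty" -c "$WORK/main.cpp" -o "$WORK/b.o" 2>/dev/null; then
        WITH_ISYSTEM="ok"
    else
        WITH_ISYSTEM="failed"
    fi

    RESULT="-I: $WITH_I, -isystem: $WITH_ISYSTEM"
    rm -rf "$WORK"
fi

echo "$RESULT"
```

The script only reports whether each compile succeeded, so it can be rerun with CXX set to a different compiler to compare behavior.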

@rspavel
Author

rspavel commented Sep 10, 2020

Was able to get a completed build by just specifying -Wno-maybe-uninitialized as a cxxflag. Still need some testing but I suspect that just may be due to current filesystem issues.
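
For reference, a rough sketch of what that invocation could look like, using the cxxflags="..." spec syntax shown earlier in the thread. The helper build_spec and the unpinned flux-sched spec are illustrative assumptions, not the exact command that was run; pin versions and dependencies as needed. The script only prints the command rather than executing it.

```shell
#!/bin/sh
# Sketch only: build_spec is a hypothetical helper, and the spec is left unpinned.
FLUX_SCHED_SPEC="flux-sched"
EXTRA_CXXFLAGS="-Wno-maybe-uninitialized"

build_spec() {
    # Append the compiler flags to the spec using Spack's cxxflags="..." form.
    printf '%s cxxflags="%s"\n' "$1" "$2"
}

SPEC=$(build_spec "$FLUX_SCHED_SPEC" "$EXTRA_CXXFLAGS")

# Print the command for inspection; paste it into a shell with spack loaded to run it.
echo "spack install $SPEC"
```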

Will make a PR to upstream spack later so that it will hopefully just work with spack install flux-sched

Thanks

@SteVwonder SteVwonder added this to the 2020 September Release milestone Sep 30, 2020
@mergify mergify bot closed this as completed in #757 Oct 1, 2020