MPI build and segfault #190
Comments
I never successfully tracked down the cause of the segfault. I do not remember what it was, but something else preempted this task, and I did not get back around to finding the problem. I do remember that it appeared that some process was attempting to write to a file descriptor that it didn't open. My suspicion at the time was that it was somehow related to MALOC. Given that MALOC and FETK don't even properly include MPI libs during their configuration process, it's possible that MPI support is broken even worse than just not including libraries.
OK; thanks for the update. I think we need to do a better job documenting these problems via issues.
On Mon, Aug 31, 2015 at 9:57 AM, Keith T. Star wrote:
Very much agreed. I'm disappointed that I let this one slip.
My plan is to create MALOC and FETK repositories here under Electrostatics, and populate them with what I used for the 1.4.1 release. At that point, I can update the configure scripts to actually link against the MPI libs and update our APBS build to depend on these two repos. When that's all done, I can dig into the code and find the bit that's causing this segfault. Sound OK @sobolevnrm? @lizutah?
Sounds reasonable to me.
Yes. I'm in favor of ditching MPI eventually... there are large enough
On Tue, Sep 1, 2015 at 1:38 PM, Keith T. Star wrote:
Status Update
Gaaahhhhh!! It all comes flooding back... Sadly the MALOC in the FETK I'm using is built with autotools. That's a nonstarter for Windows. I think the best route is to replace the MALOC source in the FETK tree with the CMake-enabled version that Andrew and Kyle created.
The master branch now has support for FETK. It depends on a Git submodule that points to our FETK repository. If you invoke cmake with -DENABLE_FETK=ON it will (on non-Win32 boxes) build FETK using autotools and link against that. If you don't enable FETK, it will use the CMake build system we bolted to MALOC to build MALOC and use that. Now to update the configure scripts to include the MPI libs...
…t used to use) when building with FETK. I also fixed a spelling error in the CMakeLists.txt files that probably seriously wrecked FETK builds. Finally, I bumped the APBS version to what it should be.
…dded HAVE_MPI_H define so that the APBS MPI code thinks it can. I think I can, I think I can, ... For issue #190.
…. This (hopefully, once and for all) closes issue #190.
This turned out to be a combination of so many different issues. The configure.ac files in FETK weren't including the MPI libs during the final link, and they were also missing from some intermediate compilation tests. Defines for compilation were not getting set in the main CMake file. They were always being set in the MALOC CMake file. It should work with FETK (built-in MALOC) and without (external, CMake-based MALOC, which is now source integrated with FETK's, BTW). There are probably other problems that got fixed and I don't remember -- but at least the commit history exists. NB: There are potential complications that will arise if you build both with and without FETK from the same repo clone. At any rate, I've had this build and run successfully on Constance and Olympus. YMMV, and I won't be surprised if we end up fielding support requests in the future. But hopefully not.
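For anyone wondering why a define that is set in only one CMake file can quietly disable MPI elsewhere, here is a minimal sketch in C. It is not taken from the APBS, MALOC, or FETK sources: HAVE_MPI_H is the define mentioned in this thread, but the function names (sketch_mpi_compiled_in, sketch_rank_and_size, sketch_report) are hypothetical and exist only for illustration. The point is that when the define is missing at compile time, the MPI branch disappears without any error and the code behaves as a serial run.

```c
#include <stdio.h>

/* HAVE_MPI_H is the define named in this thread. All function names
 * below are hypothetical and are not part of APBS, MALOC, or FETK. */
#ifdef HAVE_MPI_H
#include <mpi.h>
#endif

/* Returns 1 if the MPI branch was compiled in, 0 otherwise. */
int sketch_mpi_compiled_in(void) {
#ifdef HAVE_MPI_H
    return 1;
#else
    return 0;
#endif
}

/* Fills in rank and size. Without HAVE_MPI_H (or before MPI_Init has
 * been called) this falls back to a serial answer: rank 0 of 1. */
void sketch_rank_and_size(int *rank, int *size) {
    *rank = 0;
    *size = 1;
#ifdef HAVE_MPI_H
    {
        int initialized = 0;
        MPI_Initialized(&initialized);
        if (initialized) {
            MPI_Comm_rank(MPI_COMM_WORLD, rank);
            MPI_Comm_size(MPI_COMM_WORLD, size);
        }
    }
#endif
}

/* Prints which branch this translation unit was built with. */
void sketch_report(FILE *out) {
    int rank, size;
    sketch_rank_and_size(&rank, &size);
    fprintf(out, "MPI compiled in: %s (rank %d of %d)\n",
            sketch_mpi_compiled_in() ? "yes" : "no", rank, size);
}
```

Calling sketch_report(stdout) from a build that lacks the define reports the serial fallback, which is one quick way to check whether a given target actually received the define. Building with the define also requires the MPI headers and libraries on the compile and link lines, which is the other half of what this issue was about.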
I was working on a release for SDSC and ran into a few problems.
The first is that the FETK configure.in files do not include the MPI libraries when performing their MALOC linkage test. The result is that the test fails and reports that MALOC is not available, when in fact the failure comes from undefined MPI symbols in MALOC.
The second is that once FETK and MALOC are properly built and an MPI run is attempted, a segfault occurs while trying to print the output of the run to stdio.