Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.10.9 update crashes in CbcBaseModel::waitForThreadsInTree #591

Closed
jschueller opened this issue Apr 14, 2023 · 24 comments
Closed

2.10.9 update crashes in CbcBaseModel::waitForThreadsInTree #591

jschueller opened this issue Apr 14, 2023 · 24 comments

Comments

@jschueller
Copy link

jschueller commented Apr 14, 2023

since the cbc update 2.10.9, cbc crashes via bonmin on its simple c++ example (https://github.com/coin-or/Bonmin/tree/master/examples/CppExample)

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

NLP0012I 
              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT -2.618034       13 0.005785
NLP0014I             2         OPT -2.618034        9 0.003997
NLP0014I             3         OPT -2       10 0.004034
NLP0014I             4         OPT -1.7071068        8 0.003429
NLP0014I             5         OPT -2.5001414       20 0.006979
Cbc0010I After 0 nodes, 1 on tree, 1e+50 best solution, best possible -1.7976931e+308 (0.02 seconds)
NLP0014I             6         OPT -2.5001414       20 0.003155
NLP0014I             7         OPT -1.7071068        8 0.003414
NLP0014I             8         OPT -2.5001414       19 0.006966

Program received signal SIGSEGV, Segmentation fault.
CbcBaseModel::waitForThreadsInTree (this=0x5602947cca40, type=type@entry=2) at CbcThread.cpp:831
831         if (!baseModel->tree()->empty()) {
#0  CbcBaseModel::waitForThreadsInTree (this=0x5602947cca40, type=type@entry=2) at CbcThread.cpp:831
#1  0x00007fba92d30b15 in CbcModel::branchAndBound (this=0x7ffe141d0708, doStatistics=0) at CbcModel.cpp:5194
#2  0x00007fba9378f7ba in Bonmin::Bab::branchAndBound(Bonmin::BabSetupBase&) () from /usr/lib/libbonmin.so.4
#3  0x0000560294328a8b in main (argc=<optimized out>, argv=<optimized out>) at MyBonmin.cpp:80

Careful: dependent packages vol+bcp+bonmin have to be recompiled when cbc gets updated to see the exact same error, else it might crash elsewhere.
I used all latest cgl 0.60.7, clp 1.17.8, osi 0.108.8, coinutils 2.11.8, bonmin 1.8.9, vol 1.5.4, bcp 1.4.4, ipopt 3.14.12
It is fine if I revert to 2.10.8 (and rebuild vol/bcp/bonmin)

this is with gcc 12.2.1 / archlinux

CXXFLAGS=

-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions \
        -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security \
        -fstack-clash-protection -fcf-protection -g

LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"
@jschueller jschueller changed the title 2.10.8->2.10.9 update crashes in CbcBaseModel::waitForThreadsInTree 2.10.9 update crashes in CbcBaseModel::waitForThreadsInTree Apr 14, 2023
@tkralphs
Copy link
Member

tkralphs commented Apr 15, 2023

Could you say more about exactly how you built the projects? What were your configure options? For example, it looks like you built Cbc with --enable-cbc-parallel? I built Bonmin 1.8 with the tip of each dependent stable branch. For the Cbc stack and Bonmin itself, these are identical to the current latest releases. With that setup, everything worked fine on my machine. Can you try that also and see if it works? If so, that will give us a data point for narrowing down the issue.

coinbrew fetch [email protected]
coinbrew build Bonmin

@jschueller
Copy link
Author

I tried running coinbrew but it failed to build mumps:

$ coinbrew build Bonmin@stable/1.8.9
...
##################################################
### Building ThirdParty/Mumps 1.6 
##################################################

.../ThirdParty/Mumps/MUMPS/src/dmumps_comm_buffer.F:2667:23:

 2658 |         CALL MPI_PACK( WHAT, 1, MPI_INTEGER,
      |                       2
......
 2667 |         CALL MPI_PACK( LIST_SLAVES, NSLAVES, MPI_INTEGER,
      |                       1
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)
.../ThirdParty/Mumps/MUMPS/src/dmumps_comm_buffer.F:2670:23:

@jschueller
Copy link
Author

I will try to rebuild without --enable-cbc-parallel and bisect cbc next week.

@tkralphs
Copy link
Member

For the Mumps issue, I guess adding ADD_FFLAGS=-fallow-argument-mismatch should fix the problem, per coin-or/coinbrew#47.

@tkralphs
Copy link
Member

I now built the full stack with --enable-cbc-parallel and it still works fine for me.

@jschueller
Copy link
Author

It also happens with default compilation flags, so this is not caused by archlinux compilation flags

@jschueller
Copy link
Author

jschueller commented Apr 17, 2023

it seems to come from 1e6e301 what does this do @jjhforrest ?

1e6e3016003941406c5a63dd022c53d958111c35 is the first bad commit
commit 1e6e3016003941406c5a63dd022c53d958111c35
Author: John Forrest <[email protected]>
Date:   Fri May 27 17:37:23 2022 +0100

    to allow root symmetry orbital

 Cbc/src/CbcModel.cpp    |  388 ++++++++++++-
 Cbc/src/CbcModel.hpp    |   16 +
 Cbc/src/CbcNode.cpp     |  107 ++++
 Cbc/src/CbcSymmetry.cpp | 1377 +++++++++++++++++++++++++++++++++++++++--------
 Cbc/src/CbcSymmetry.hpp |   82 ++-
 5 files changed, 1720 insertions(+), 250 deletions(-)
bisect found first bad commit

@jschueller
Copy link
Author

the error disappears if I disable nauty

@jschueller
Copy link
Author

@tkralphs what distro / compiler do you use ? do you enable nauty ?

@jjhforrest
Copy link
Contributor

I am unable to reproduce this error - with or without nauty. The symmetry breaking code is only entered if user has set option - and that is not set in example.

@jschueller
Copy link
Author

what distro / compiler are you using ?

@jjhforrest
Copy link
Contributor

Bonmin 1.8.9 and gcc 11.3

@jschueller
Copy link
Author

jschueller commented Apr 17, 2023

Is that the version from ubuntu ?
I'm using gcc 12.2

@jjhforrest
Copy link
Contributor

Ubuntu

jschueller added a commit to jschueller/Cbc that referenced this issue Apr 17, 2023
COIN_HAS_NTY is only defined during compilation of Cbc but is used in
public header CbcModel.hpp
Hiding attributes to 3rd party code can lead to crashes from alignment issues

Closes coin-or#591
@jschueller
Copy link
Author

jschueller commented Apr 17, 2023

I think I figured it out: COIN_HAS_NTY private macro is used in public header CbcModel.hpp
so 3rd party code dont see a class with the same size, I guess this causes the crash with gcc 12, an alignement issue or something
if I just drop the COIN_HAS_NTY from the definition of the attributes it fixes my crash locally

jschueller added a commit to jschueller/Cbc that referenced this issue Apr 17, 2023
COIN_HAS_NTY is only defined during compilation of Cbc but is used in
public header CbcModel.hpp
We also have to unconditionnaly include CbcSymmetry.hpp in case nauty is disabled
Hiding attributes to 3rd party code can lead to crashes from alignment issues

Fixes 1e6e301
Closes coin-or#591
@svigerske
Copy link
Member

Yes, that could be it. The use of COIN_HAS_NTY in public headers was indeed introduced with 1e6e301.
(On master, this is named CBC_HAS_NTY and the define is exported in CbcConfig.h, but not on this ancient stable branch.)

@jschueller
Copy link
Author

I proposed to drop it but maybe its best to define it in the config header as you mentionned

@svigerske
Copy link
Member

Problem with the name COIN_HAS_NTY is that it is too generic. If someone still tries to build Couenne, then it could give a conflict there (probably just a compiler warning, though).

@jschueller
Copy link
Author

jschueller commented Apr 17, 2023

lets drop it then as in #593 ?

@svigerske
Copy link
Member

I leave this to Ted or John. I don't really see why 1e6e301 has been added to stable/2.10 in the first place. It doesn't look like a bugfix to me and the changelog entry in the readme isn't enlightening either. If they want to keep it, then #593 is also fine with me.

@tkralphs
Copy link
Member

I missed that there were more than just bug fixes being introduced in this release. #593 looks to be the right fix. I will merge and make a new release unless @jjhforrest objects.

@jjhforrest
Copy link
Contributor

go ahead - I will try and remember not to put anything interesting into stable - just fixes.

@jschueller
Copy link
Author

jschueller commented Apr 17, 2023

yes, boring stuff only please :]

tkralphs added a commit that referenced this issue Apr 18, 2023
COIN_HAS_NTY is only defined during compilation of Cbc but is used in
public header CbcModel.hpp

* Fixes bug introduced in 1e6e301
* Closes #591
* Also, fixes the Github workflow so that we're building with nuaty on Linux, as was intended.
* Eliminate zapRootSymmetry(), since it is not defined or used.

---------

Co-authored-by: Ted Ralphs <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants