Graph method edge_disjoint_spanning_trees() can hang #38831
On macOS 14.7 and SageMath 10.5.beta7, I get
sage: %time G.edge_disjoint_spanning_trees(5, solver='glpk', verbose=3)
GLPK Simplex Optimizer 5.0
1883 rows, 1355 columns, 10390 non-zeros
0: obj = -0.000000000e+00 inf = 6.600e+02 (415)
443: obj = -0.000000000e+00 inf = 1.588e-14 (0) 1
OPTIMAL LP SOLUTION FOUND
GLPK Integer Optimizer 5.0
1883 rows, 1355 columns, 10390 non-zeros
1215 integer variables, all of which are binary
Preprocessing...
140 hidden covering inequaliti(es) were detected
1864 rows, 1310 columns, 10110 non-zeros
1170 integer variables, all of which are binary
Scaling...
A: min|aij| = 1.000e+00 max|aij| = 2.800e+01 ratio = 2.800e+01
GM: min|aij| = 9.438e-01 max|aij| = 1.060e+00 ratio = 1.123e+00
EQ: min|aij| = 8.945e-01 max|aij| = 1.000e+00 ratio = 1.118e+00
2N: min|aij| = 5.000e-01 max|aij| = 1.562e+00 ratio = 3.125e+00
Constructing initial basis...
Size of triangular part is 1859
Solving LP relaxation...
GLPK Simplex Optimizer 5.0
1864 rows, 1310 columns, 10110 non-zeros
443: obj = -0.000000000e+00 inf = 5.626e+02 (302)
900: obj = -0.000000000e+00 inf = 1.019e-13 (0) 4
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
Long-step dual simplex will be used
+ 900: mip = not found yet <= +inf (1; 0)
+ 2634: >>>>> 0.000000000e+00 <= 0.000000000e+00 0.0% (215; 17)
+ 2634: mip = 0.000000000e+00 <= tree is empty 0.0% (0; 461)
INTEGER OPTIMAL SOLUTION FOUND
CPU times: user 1.42 s, sys: 2.12 ms, total: 1.42 s
Wall time: 1.42 s
[Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices]
sage:
sage: %time G.edge_disjoint_spanning_trees(5, solver='cplex', verbose=3)
Version identifier: 22.1.1.0 | 2022-11-28 | 9160aff4d
Tried aggregator 1 time.
MIP Presolve eliminated 184 rows and 45 columns.
MIP Presolve modified 2340 coefficients.
Reduced MIP has 1694 rows, 1310 columns, and 6540 nonzeros.
Reduced MIP has 1170 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (5.19 ticks)
Probing time = 0.00 sec. (1.47 ticks)
Tried aggregator 1 time.
Detecting symmetries...
Reduced MIP has 1694 rows, 1310 columns, and 6540 nonzeros.
Reduced MIP has 1170 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (5.03 ticks)
Probing time = 0.00 sec. (1.50 ticks)
Clique table members: 559.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 10 threads.
Root relaxation solution time = 0.00 sec. (4.58 ticks)
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap
0 0 0.0000 263 0.0000 372
0 0 0.0000 16 Cuts: 6 387
0 0 0.0000 22 Cuts: 9 410
0 0 0.0000 4 Covers: 1 412
0 0 0.0000 16 Cuts: 2 429
* 0+ 0 0.0000 0.0000 0.00%
0 0 cutoff 0.0000 0.0000 429 0.00%
Elapsed time = 0.09 sec. (155.42 ticks, tree = 0.01 MB, solutions = 1)
Cover cuts applied: 3
Flow cuts applied: 2
Mixed integer rounding cuts applied: 2
Lift and project cuts applied: 1
Gomory fractional cuts applied: 1
Root node processing (before b&c):
Real time = 0.09 sec. (155.50 ticks)
Parallel b&c, 10 threads:
Real time = 0.00 sec. (0.00 ticks)
Sync time (average) = 0.00 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 0.09 sec. (155.50 ticks)
CPU times: user 698 ms, sys: 17.1 ms, total: 716 ms
Wall time: 134 ms
[Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices,
Digraph on 28 vertices]
Clearly, one should implement a combinatorial tree-packing algorithm such as the one in https://doc.sagemath.org/html/en/reference/references/index.html#gabow1995. So far we only have the part that computes the edge connectivity (https://doc.sagemath.org/html/en/reference/graphs/sage/graphs/edge_connectivity.html), and the tree-packing part is not easy to implement. Help is more than welcome here. For undirected graphs, we already have the Roskind-Tarjan algorithm.
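Whichever algorithm ends up replacing the MILP, its output can be checked without any solver. Here is a minimal sketch of such a check in plain Python. It assumes that the directed case is meant to return pairwise arc-disjoint spanning arborescences (out-trees) with a common root, and that the digraph has no multiple arcs; the function name `is_arborescence_packing` is hypothetical and not part of Sage.

```python
from collections import defaultdict, deque


def is_arborescence_packing(vertices, root, trees):
    """Return True if `trees` are pairwise arc-disjoint spanning
    arborescences rooted at `root`.

    `trees` is a list of arc lists, each arc being a pair (u, v).
    Assumes a simple digraph (no parallel arcs).
    """
    vertices = set(vertices)
    used = set()  # arcs already claimed by some tree
    for arcs in trees:
        arcs = list(arcs)
        # A spanning arborescence has exactly n - 1 arcs.
        if len(arcs) != len(vertices) - 1:
            return False
        # No arc may be reused, within a tree or across trees.
        for arc in arcs:
            if arc in used:
                return False
            used.add(arc)
        # With n - 1 arcs, reaching every vertex from the root
        # is enough to guarantee an arborescence.
        out = defaultdict(list)
        for u, v in arcs:
            out[u].append(v)
        seen = {root}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in out[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if seen != vertices:
            return False
    return True
```

In a Sage session, the arc lists could presumably be obtained from the returned digraphs with something like `[T.edges(labels=False) for T in trees]`, which would make this usable as a regression check for a new implementation.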
With
GLPK upstream is probably not going to be much help if that turns out to be the issue. |
I don't know what to do here. I see the same behavior on macOS and Fedora 39 (i.e., it works well). The only solution I see, as already discussed with @dimpase a long time ago, is to implement a combinatorial algorithm and avoid linear programming here. We already changed the formulation (see #32169), but that is apparently not sufficient to avoid issues.
Rebuilding GLPK with |
It's OK at edit: OK at |
Clang, on the other hand, shows no improvement at |
then this should be reported upstream. |
GLPK upstream is not quite dead -- the maintainer responds on the "help" list sometimes -- but there's no public source tree and no response to e.g. basic build fixes reported years ago (https://lists.gnu.org/archive/html/bug-glpk/2020-03/msg00003.html, https://lists.gnu.org/archive/html/bug-glpk/2022-08/msg00000.html). I'll still do it, but I wouldn't get your hopes up that it will lead to a fix. |
In a surprise plot twist, if I write the LP to a file and then hand it to |
Does that mean there is something wrong with our interface?
Not necessarily. It could be that glpsol uses the API in a different way that bypasses a bug in the library, or that Cython is generating buggy code, or even that the compiler is compiling the Cython code differently from the glpsol code. Of course, it could also be our interface. I'll have to dig into the GLPK internals later to see what I can learn from inside the two "solve" routines.
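For anyone who wants to repeat the comparison with the standalone solver, here is a rough sketch of driving `glpsol` from Python with a time limit, so that a "hang" turns into a bounded run. It assumes the problem has already been exported in CPLEX LP format; the file names are placeholders, and how the file is exported from Sage is not shown. `--lp`, `--tmlim` and `-o` are standard `glpsol` options; the helper names are hypothetical.

```python
import shutil
import subprocess


def glpsol_command(lp_path, solution_path, time_limit_s=60):
    """Build a glpsol command line for a problem in CPLEX LP format."""
    return [
        "glpsol",
        "--lp", lp_path,                # read the problem in CPLEX LP format
        "--tmlim", str(time_limit_s),   # stop searching after this many seconds
        "-o", solution_path,            # write the solution report to this file
    ]


def run_glpsol(lp_path, solution_path, time_limit_s=60):
    """Run glpsol if it is installed; return its exit code, or None if missing."""
    if shutil.which("glpsol") is None:
        return None
    cmd = glpsol_command(lp_path, solution_path, time_limit_s)
    return subprocess.run(cmd).returncode
```

Running the same file on each affected architecture and compiler setting would at least show whether the standalone solver and the library path diverge on identical input.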
as |
I've spent the better part of two days debugging this, painstakingly adding recursive printf statements to dump the entire internal representation of the LP, only to find... that
it's because the resulting LP (while hopefully equivalent) is not identical. If I use Since #34575 reports the same issue on ARM, and since I'm seeing it on RISC-V but only with certain compilers and optimization levels, my current guess is that it's some obscure numerical issue that gets consumed by architecture-dependent compiler optimizations. |
it doesn't explain why |
PPL fails consistently though, that's easier to explain. |
One doctest in this file is "hanging" on ARM64 and RISC-V as GLPK tries courageously to solve a MIP. A tweak to the solver options allows this problem to be solved on those two architectures without affecting any others. This is unlikely to solve the general problem, but it may buy us some time. Closes sagemath#34575 Closes sagemath#38831
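Since the symptom is a run that never returns, it may help to wrap reproduction attempts in a hard time limit instead of waiting by hand. The sketch below is plain Python and independent of Sage; the helper name `finishes_within` is hypothetical. It runs a command in a child process and reports whether it finished in time, killing the child otherwise.

```python
import subprocess
import sys


def finishes_within(argv, seconds):
    """Run `argv` as a child process.

    Return True if it exits successfully within `seconds`, False if the
    time limit is hit (the child is killed in that case). A nonzero exit
    status raises subprocess.CalledProcessError.
    """
    try:
        subprocess.run(argv, timeout=seconds, check=True)
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__":
    # Stand-in for a computation that hangs: sleep far longer than the limit.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(finishes_within(slow, 1))
```

To apply this to the doctested example, `argv` would be replaced by a Sage invocation (for instance `sage -c` with the commands from the doctest), which is an assumption about the local setup and not something taken from this thread.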
I have a new machine where it looks like #32169 has returned. With the setup from https://github.com/sagemath/sage/blob/develop/src/sage/graphs/generic_graph.py#L7214,
The following complete quickly,
But the example that is doctested...
hangs "indefinitely" (I stopped waiting after 12 hours). If I Ctrl-C it from an interactive session, it looks like it gets stuck while trying to solve the MIP: