Contingency analysis application fails on Polish network #191

Open
bjpalmer opened this issue Dec 12, 2023 · 16 comments

@bjpalmer
Contributor

I just tried running the contingency analysis application using the Polish network, and it fails after running a few of the contingencies. I think this is because the PETSc solver is failing but the resulting error is not being trapped properly. The powerflow solve method is supposed to trap the error and then return false so that the application can keep going, but this does not appear to be happening.
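
For readers following along, the intended behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual GridPACK code: `solve_trapped` and `count_failures` are hypothetical names, and the solver is modeled as any callable that throws on failure.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a power flow solve. It runs the supplied
// linear solve and converts any thrown exception into a `false` return.
bool solve_trapped(const std::function<void()> &linear_solve) {
  try {
    linear_solve();
    return true;
  } catch (const std::exception &) {
    return false;  // solver reported a failure through an exception
  } catch (...) {
    return false;  // anything else that was thrown
  }
}

// Hypothetical driver loop. A failed contingency is counted and skipped,
// so the remaining contingencies still run.
int count_failures(const std::vector<std::function<void()>> &contingencies) {
  int failed = 0;
  for (const auto &contingency : contingencies) {
    if (!solve_trapped(contingency)) {
      ++failed;
    }
  }
  return failed;
}
```

This pattern only works when the failure reaches the caller as an exception (or an error code that gets turned into one). It cannot help if the process is terminated before control returns.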

@wperkins
Member

When I run the case with 1 process, it gets through 311 tasks (or so) and then exits -- exit() is called. PETSc never gets a chance to report convergence to GridPACK, so no exception can be thrown. This is on Ubuntu with complex PETSc 3.19.4 and SuperLU_dist.

#0  __GI_exit (status=status@entry=1) at exit.c:138
#1  0x0000155552a08c08 in ztrsm_ (side=side@entry=0x1555520d4176 "R", 
    uplo=uplo@entry=0x1555520d4170 "U", transa=transa@entry=0x1555520d4172 "N", 
    diag=diag@entry=0x1555520d4172 "N", m=m@entry=0x7fffffffbef8, n=n@entry=0x7fffffffbf08, 
    alpha=0x7fffffffbf40, a=<optimized out>, lda=0x7fffffffbf0c, b=<optimized out>, 
    ldb=0x7fffffffbf04) at ztrsm.c:324
#2  0x00001555520a149b in pzgstrf2_trsm (options=options@entry=0x55555c16edf0, k0=k0@entry=273, 
    k=k@entry=366, thresh=thresh@entry=6.2290136121758314e-07, 
    Glu_persist=Glu_persist@entry=0x55555c0534c0, grid=grid@entry=0x55555c16ed28, 
    Llu=Llu@entry=0x55555c28b840, U_diag_blk_send_req=0x0, tag_ub=2147483647, stat=0x7fffffffc8b0, 
    info=0x7fffffffc880)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/ubuntu-complex-shared/externalpackages/git.superlu_dist/SRC/pzgstrf2.c:312
#3  0x000015555209ae99 in pzgstrf (options=options@entry=0x55555c16edf0, m=m@entry=5991, 
    n=n@entry=5991, anorm=anorm@entry=10.450550683841415, LUstruct=LUstruct@entry=0x55555c16eee8, 
    grid=grid@entry=0x55555c16ed28, stat=stat@entry=0x7fffffffc8b0, info=0x7fffffffc880)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/ubuntu-complex-shared/externalpackages/git.superlu_dist/SRC/pzgstrf.c:1137
#4  0x000015555207a502 in pzgssvx (options=options@entry=0x55555c16edf0, A=0x55555c16eea0, 
    ScalePermstruct=0x55555c16eec0, B=B@entry=0x0, ldb=5991, nrhs=nrhs@entry=0, 
    grid=0x55555c16ed28, LUstruct=0x55555c16eee8, SOLVEstruct=0x55555c16ef10, berr=0x0, 
    stat=0x7fffffffc8b0, info=0x7fffffffc880)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/ubuntu-complex-shared/externalpackages/git.superlu_dist/SRC/pzgssvx.c:1181
#5  0x00001555539a9569 in MatLUFactorNumeric_SuperLU_DIST (F=0x55555c2e2b40, A=0x55555e04cac0, 
    info=<optimized out>)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:447
#6  0x000015555347a591 in MatLUFactorNumeric (fact=0x55555c2e2b40, mat=0x55555e04cac0, 
    info=info@entry=0x555558fcd388)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/mat/interface/matrix.c:3243
#7  0x0000155553ee4e53 in PCSetUp_LU (pc=0x55555dd16aa0)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/ksp/pc/impls/factor/lu/lu.c:120
#8  0x0000155553e3fad0 in PCSetUp (pc=0x55555dd16aa0)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/ksp/pc/interface/precon.c:994
#9  0x0000155553f52f51 in KSPSetUp (ksp=0x55555c04cdf0)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/ksp/ksp/interface/itfunc.c:406
#10 0x0000155553f545ea in KSPSolve_Private (ksp=0x55555c04cdf0, b=0x555558eab140, x=<optimized out>)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/ksp/ksp/interface/itfunc.c:824
#11 0x0000155553f55907 in KSPSolve (ksp=<optimized out>, b=<optimized out>, x=<optimized out>)
    at /home/d3g096/Projects/GridPakLDRD/petsc-3.19.4/src/ksp/ksp/interface/itfunc.c:1070
#12 0x00001555552e306e in gridpack::math::PETScLinearSolverImplementation<double, int>::p_resolveImpl (this=0x55555dc3cb40, b=..., x=...)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/math/petsc/petsc_vector_extractor.hpp:55
#13 0x00001555552dfeb5 in gridpack::math::PETScLinearSolverImplementation<double, int>::p_solveImpl
    (this=0x55555dc3cb40, A=..., b=..., x=...)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/math/petsc/petsc_matrix_extractor.hpp:57
#14 0x00001555552e2078 in gridpack::math::LinearSolverImplementation<double, int>::p_solve (
    this=0x55555dc3cb40, b=..., x=...)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/gridpack/math/vector_interface.hpp:383
#15 0x000015555543e7c8 in gridpack::math::BaseLinearSolverInterface<double, int>::solve (x=..., 
    b=..., this=0x7fffffffcfe0)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/gridpack/math/linear_solver_interface.hpp:116
#16 gridpack::powerflow::PFAppModule::solve (this=this@entry=0x7fffffffdb80)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/powerflow/pf_app_module.cpp:419
#17 0x000055555556b711 in gridpack::contingency_analysis::CADriver::execute (this=<optimized out>, 
    argc=<optimized out>, argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/contingency_analysis/ca_driver.cpp:501
#18 0x0000555555567377 in main (argc=<optimized out>, argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/contingency_analysis/ca_main.cpp:38
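
The trace shows `exit` being reached from inside the factorization call chain, which is why a `try`/`catch` in the caller has nothing to catch. The sketch below illustrates that general C++ behavior under stated assumptions: `library_factor`, `guarded_factor`, and `child_exit_status` are hypothetical names, and the POSIX `fork`/`waitpid` calls are used only so that the effect can be observed from a surviving parent process. It is not a proposed fix for GridPACK.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a third-party routine that terminates the
// process on failure instead of returning an error code.
void library_factor(bool singular) {
  if (singular) {
    std::exit(1);
  }
}

// Tries to trap the failure the way an exception would be trapped.
// Returns 0 on success and 2 if an exception was caught.
int guarded_factor(bool singular) {
  try {
    library_factor(singular);
    return 0;
  } catch (...) {
    return 2;  // not reached when exit() is called: nothing is thrown
  }
}

// Runs guarded_factor in a forked child (POSIX only) and returns the
// child's exit status, or -1 if the child could not be run or waited on.
int child_exit_status(bool singular) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    _exit(guarded_factor(singular));  // child: report the result as status
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In this sketch the failing case ends the child with status 1, the value passed to `exit`, rather than the 2 that the `catch` block would have produced.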

@abhyshr
Collaborator

abhyshr commented Dec 12, 2023

Hi Bruce and Bill, who wrote the contingency analysis application and when was it last run or tested?

@wperkins
Member

> Hi Bruce and Bill, who wrote the contingency analysis application and when was it last run or tested?

The unit test runs on the 14-bus case.

@bjpalmer
Contributor Author

bjpalmer commented Dec 12, 2023

I wrote most of the contingency analysis application. I can't remember the last time anyone was able to run the Polish or European network test cases. I think it has always been true that some of the contingencies have solver failures (beyond failing to converge), so this seems like it may be a new problem.

@bjpalmer
Contributor Author

I saw something similar. I also tried running with the KLU solver and I think the code just crashes from somewhere inside PETSc (I'm getting a PETSc stack trace). It looks like the exit is coming from somewhere inside SuperLU. Is there some way to keep that from happening and just return an error to the calling program?

@wperkins
Member

The Polish case runs to completion for me with 1 process using real PETSc 3.19.4 and SuperLU_dist.

@wperkins
Member

> I saw something similar. I also tried running with the KLU solver and I think the code just crashes from somewhere inside PETSc (I'm getting a PETSc stack trace). It looks like the exit is coming from somewhere inside SuperLU. Is there some way to keep that from happening and just return an error to the calling program?

I don't see how without modifying SuperLU_dist.

@bjpalmer
Contributor Author

Seems like an oversight on SuperLU's part to do that. If it fails, it should hand control back to the user to figure out how to deal with it.

> The Polish case runs to completion for me with 1 process using real PETSc 3.19.4 and SuperLU_dist.

I'm using complex PETSc 3.16.3 and shared libraries. I'll see what happens with reals.

@wperkins
Member

> The Polish case runs to completion for me with 1 process using real PETSc 3.19.4 and SuperLU_dist.

Seems really slow with more processors, though.

@bjpalmer
Contributor Author

> Seems really slow with more processors, though.

Are you using the two-sided runtime?

@wperkins
Member

> > Seems really slow with more processors, though.
>
> Are you using the two-sided runtime?

Yes. But, it runs fine using 8 processes with complex MUMPS.

@bjpalmer
Contributor Author

How about with progress ranks?

@wperkins
Member

> How about with progress ranks?

I haven't tried. I generally don't build GA that way.

@wperkins
Member

I notice none of these input files set `<PETScPrefix>` for the `LinearSolver`.

> > > Seems really slow with more processors, though.
> >
> > Are you using the two-sided runtime?
>
> Yes. But, it runs fine using 8 processes with complex MUMPS.

Also seems fine using 8 processes and the complex PETSc LU solver. This is really looking like a SuperLU_dist problem to me.

@wperkins
Member

Seems fine with plain SuperLU (not dist) as well.

wperkins added a commit that referenced this issue Dec 13, 2023
* Fatal error if `pkg-config` is not found (#161).

* Remove unnecessary CMake find package files (left over from #182)

* Fix input data files using old PETSc options (left over from #164)

* Contingency analysis really should set `PETScPrefix` (#191)
@bjpalmer
Contributor Author

I ran with just SuperLU and it works for me too. I also tried it on the European open model and it works for that one too. We should change the input files for this calculation (and the European network) and call it good.
