Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix non-working unit tests and smoke tests for new features. #164

Merged
merged 21 commits into from
Nov 22, 2023

Conversation

wperkins
Copy link
Member

@wperkins wperkins commented Aug 29, 2023

The goal of this PR is to get GridPACK's unit testing back into working order. The main problem is Matrix<> tests, but some applications fail too. In addition, some new features of dynamic simulation need smoke tests. This should fix #118.

@wperkins
Copy link
Member Author

Changes added should close #160. See comment there.

@wperkins wperkins self-assigned this Aug 29, 2023
@wperkins wperkins requested review from abhyshr and bjpalmer August 29, 2023 21:56
@wperkins
Copy link
Member Author

wperkins commented Aug 29, 2023

@abhyshr, I was adding a smoke test for input_twoarea_renewable_mech.xml, but it refers to Benchmark_twoarea_v33.raw which does not exist in the raw directory. Did it not get checked in?

@wperkins wperkins mentioned this pull request Aug 29, 2023
@abhyshr
Copy link
Collaborator

abhyshr commented Aug 29, 2023

Bill, you can use the same two area system network file for the tests. I renamed the file and may have forgotten to change its name in different input.xml files.

@wperkins
Copy link
Member Author

PETSc matrix transpose semantics have changed such that implementing transpose into an existing Matrix<> is very difficult. So, I'm going to remove this function

template <typename T, typename I>
void  transpose(const MatrixT<T, I>& A, MatrixT<T, I>& result);

It does not appear in any application code. The function creating a new Matrix<> containing a transpose, i.e.,

template <typename T, typename I>
MatrixT<T, I> *transpose(const MatrixT<T, I>& A);

is still available and should be used exclusively now.
This is the cause of some matrix test failures.

@bjpalmer
Copy link
Contributor

I just tried building and running 240 bus case in the fix/testing branch. I'm not seeing a crash at the end. Ths is my environment

[d3g293@constance01 ~]$ module list
Currently Loaded Modulefiles:
  1) paraview/5.8.1        3) openmpi/4.1.4         5) cmake/3.22.1          7) java/13.0.2           9) texlive/2017
  2) gcc/12.1.0            4) python/2.7.3          6) git/2.11.0(default)   8) zeromq/4.1.4

and this is how I'm configuring

[d3g293@constance01 build_ts]$ more build_csh 
rm -rf CMake*

cmake -Wdev \
      -D PETSC_DIR:STRING='/pic/projects/gridpack/software/petsc-3.16.3' \
      -D PETSC_ARCH:STRING='linux-openmpi-gnu-cxx-complex-opt-so' \
      -D GA_DIR:STRING='/pic/projects/gridpack/software/ga-5.7/build_ts' \
      -D USE_PROGRESS_RANKS:BOOL=TRUE \
      -D GA_EXTRA_LIBS='-lrt' \
      -D MPI_CXX_COMPILER:STRING='mpicxx' \
      -D MPI_C_COMPILER:STRING='mpicc' \
      -D MPIEXEC:STRING='mpiexec' \
      -D MPIEXEC_MAX_NUMPROCS:STRING="3" \
      -D BOOST_ROOT:STRING='/pic/projects/gridpack/software/boost_1_65_0' \
      -D CMAKE_INSTALL_PREFIX:PATH='/people/d3g293/tmp/gridpack-wind/src/build_ts/install' \
      -D CMAKE_BUILD_TYPE:STRING='DEBUG' \
      -D CMAKE_VERBOSE_MAKEFILE:STRING=TRUE \
      CXXFLAGS="-pthread -fsanitize=address" \
      ..

@wperkins
Copy link
Member Author

I just tried building and running 240 bus case in the fix/testing branch. I'm not seeing a crash at the end. Ths is my environment
[...]

I too do not get any errors when building Debug. dsf.x and dsf2.x both run fine. But, if I build RelWithDebInfo, both dsf.x and dsf2.x consistently produce a SEGV here

#1  0x000015555547c33e in gridpack::dynamic_simulation::DSFullFactory::securityCheck (this=0x555555bc0230)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_factory.cpp:408
408           angle = p_buses[i]->getAngle();

I've consistently get the SEGV at i=105 when running serial. It's different i in parallel. I don't know if it's consistent though. Here's the full stack trace:

#0  0x000015555547fdb8 in gridpack::dynamic_simulation::DSFullBus::getAngle (
    this=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:1040
#1  0x000015555547c33e in gridpack::dynamic_simulation::DSFullFactory::securityCheck (this=0x55555598aaf0)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_factory.cpp:408
#2  0x0000155555479090 in gridpack::dynamic_simulation::DSFullApp::runonestep (
    this=0x7fffffffdc50)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_app_module2.cpp:324
#3  0x00001555554793d0 in gridpack::dynamic_simulation::DSFullApp::run (
    this=0x7fffffffdc50, tend=10)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_app_module2.cpp:409
#4  0x0000555555566b37 in run_dynamics (argc=<optimized out>, 
    argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/dynamic_simulation_full_y/dsf_main2.cpp:81
#5  0x0000555555566066 in main (argc=<optimized out>, argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/dynamic_simulation_full_y/dsf_main2.cpp:105

@wperkins
Copy link
Member Author

@abhyshr, for the DSA application i have been using the data files here, specifically input files input_2000bus.xml and input_tamu2000_dsf.xml. I've discovered that these will not work with dsf.x because of the way events are read. These cases seem to work fine with dsf2.x.

@abhyshr
Copy link
Collaborator

abhyshr commented Aug 30, 2023

@wperkins wind2.x is the application you should be using for the DSA test, not wind.x. Do you not see the application when you build GridPACK? If not, can you rebase develop over your branch and retry. I merged the wind_dsa branch with develop. Also, using the 2000 bus file for testing wind_dsa would be too time-consuming since it is set to run a lot of contingencies. We'll probably need to use a smaller dataset.

@abhyshr
Copy link
Collaborator

abhyshr commented Aug 30, 2023

I just tried building and running 240 bus case in the fix/testing branch. I'm not seeing a crash at the end. Ths is my environment
[...]

I too do not get any errors when building Debug. dsf.x and dsf2.x both run fine. But, if I build RelWithDebInfo, both dsf.x and dsf2.x consistently produce a SEGV here

#1  0x000015555547c33e in gridpack::dynamic_simulation::DSFullFactory::securityCheck (this=0x555555bc0230)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_factory.cpp:408
408           angle = p_buses[i]->getAngle();

I've consistently get the SEGV at i=105 when running serial. It's different i in parallel. I don't know if it's consistent though. Here's the full stack trace:

#0  0x000015555547fdb8 in gridpack::dynamic_simulation::DSFullBus::getAngle (
    this=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:1040
#1  0x000015555547c33e in gridpack::dynamic_simulation::DSFullFactory::securityCheck (this=0x55555598aaf0)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_factory.cpp:408
#2  0x0000155555479090 in gridpack::dynamic_simulation::DSFullApp::runonestep (
    this=0x7fffffffdc50)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_app_module2.cpp:324
#3  0x00001555554793d0 in gridpack::dynamic_simulation::DSFullApp::run (
    this=0x7fffffffdc50, tend=10)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/modules/dynamic_simulation_full_y/dsf_app_module2.cpp:409
#4  0x0000555555566b37 in run_dynamics (argc=<optimized out>, 
    argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/dynamic_simulation_full_y/dsf_main2.cpp:81
#5  0x0000555555566066 in main (argc=<optimized out>, argv=<optimized out>)
    at /home/d3g096/Projects/GridPACK-Wind/src/GridPACK/src/applications/dynamic_simulation_full_y/dsf_main2.cpp:105

I think that bus does not have an angle field. The angle field is set by a generator. For the 240 bus system, there are buses with generators that do not have angle variables and hence the segfault. What is triggering the securityCheck method? That should be turned off I think or made optional to check it only when some flag is set.

@wperkins
Copy link
Member Author

@wperkins wind2.x is the application you should be using for the DSA test, not wind.x. Do you not see the application when you build GridPACK? If not, can you rebase develop over your branch and retry. I merged the wind_dsa branch with develop. Also, using the 2000 bus file for testing wind_dsa would be too time-consuming since it is set to run a lot of contingencies. We'll probably need to use a smaller dataset.

Doh! wind2.x is there. I was using dsf2.x.

@bjpalmer
Copy link
Contributor

bjpalmer commented Sep 1, 2023

I tried running the 240 bus test case after compiling with RelWithDebInfo. The code crashes for me. According to Valgrind, it is trying to access some invalid data in this routine

/**
 * Get roter angle of generators
 */
double gridpack::dynamic_simulation::DSFullBus::getAngle()
{
  if (p_ngen <= 0) return 0.0;
  int i;
  printf("BUS: %d P_NGEN: %d state.size: %d status: %d\n",getOriginalIndex(),p_ngen,p_gstatus.size(),p_gstatus[i]);
  for (i = 0; i < p_ngen; i++) {
    if(!p_gstatus[i] || !p_generators.size()) continue;
    double angle = p_generators[i]->getAngle();
    return angle;
  }
}

I do not understand why Valgrind is reporting an error (both p_gstatus[0] and p_generators[0] appear to exist and have been initialized) but for some reason it is. This is the first generator with status 0 so maybe it has something to do with that. However, this routine seems to be pretty bogus as written. It basically returns the angle of the first generator on the bus if there is more than one and some indeterminate answer if there are no generators or the generators all have status 0. It should be cleaned up. Any suggestions on what it should do if there are multiple generators? Or maybe we need to modify the syntax so that this function takes a generator index. Also, if the generator is turned off and no value is set, should it return 0?

@abhyshr
Copy link
Collaborator

abhyshr commented Sep 1, 2023

@bruce: What is the bus number (external) for this generator? Let me check if this generator is of renewable type, in which case it does not have any Angle variable. The getAngle() method is only valid for conventional rotating machine generators. Trying to access Angle() for a renewable generator would be probably resulting in an error.

This piece of code needs to go away. Instead something like
bool getAngle(&angle); will be useful. It will return false for renewable generators and true for others with the value of the angle.

@bruce
Copy link

bruce commented Sep 1, 2023

👋 @abhyshr, did you mean @bjpalmer?

@bjpalmer
Copy link
Contributor

bjpalmer commented Sep 1, 2023 via email

@bjpalmer
Copy link
Contributor

bjpalmer commented Sep 1, 2023

I looked at this a bit more. I don't think Valgrind is catching the true problem but I think some sort of memory issue is being created if the status is set to 0. It could be that a data vector is not extended to account for that generator but later on somebody tries to access it anyway. I don't think it is actually anything to do with the getAngle function, but that is just where the system final steps on something that has already been corrupted and crashes.

@wperkins
Copy link
Member Author

All math tests are fixed. The problem with DAE solver tests was the addition of a DAESolver constructor with an inconsistent matrix pointer type.

This was referenced Oct 10, 2023
@wperkins
Copy link
Member Author

@abhyshr and @jacksavage, here is a current summary of unit test results as of revision f5e91bc on this branch. This is on my Ubuntu 20 system with

  • GCC 9.4
  • OpenMPI 4.0.3
  • Boost 1.71
  • PETSc 3.16.6

Debug build:

99% tests passed, 1 tests failed out of 113

Total Test time (real) = 190.00 sec

The following tests FAILED:
         71 - dynamic_simulation_full_y_145_bus_parallel (Failed)
Errors while running CTest

Test 71 fails with

71: malloc(): unsorted double linked list corrupted
71: [tlaloc:1632690] *** Process received signal ***
71: [tlaloc:1632690] Signal: Aborted (6)

which is one of those random memory corruption errors I've been complaining about.

Release build:

96% tests passed, 5 tests failed out of 113

Total Test time (real) = 131.47 sec

The following tests FAILED:
         71 - dynamic_simulation_full_y_145_bus_parallel (Failed)
         74 - dynamic_simulation_full_y_240_bus_serial (Failed)
         75 - dynamic_simulation_full_y_240_bus_parallel (Failed)
         76 - dynamic_simulation_full_y2_240_bus_serial (Failed)
         77 - dynamic_simulation_full_y2_240_bus_parallel (Failed)
Errors while running CTest

Test 71 fails as described as with the Debug build. The other tests all SEGV as discussed above.

@wperkins wperkins added this to the GridPACK 3.5 Release milestone Oct 12, 2023
@wperkins wperkins changed the title DRAFT: Fix non-working unit tests and smoke tests for new features. Fix non-working unit tests and smoke tests for new features. Oct 25, 2023
getAngle() method is messed up and it is NOT defined for REGCA1 model
@wperkins
Copy link
Member Author

This is ready to go.

[...]
Test 71 fails as described as with the Debug build. The other tests all SEGV as discussed above.

I don't think the remaining problems we are seeing with dynamic simulation smoke tests are our fault. See my comment on #175.

@wperkins
Copy link
Member Author

This PR resolves #116, resolves #117, closes #118, closes #160, resolves #175

@abhyshr abhyshr merged commit 33c6d74 into develop Nov 22, 2023
@wperkins wperkins deleted the fix/testing branch November 29, 2023 13:53
wperkins added a commit that referenced this pull request Dec 13, 2023
* Fatal error if `pkg-config` is not found (#161).

* Remove unnecessary CMake find package files (left over from #182)

* Fix input data files using old PETSc options (left over from #164)

* Contingency analysis really should set `PETScPrefix` (#191)
bjpalmer pushed a commit that referenced this pull request Apr 12, 2024
* Remove GA-based dense matrix; adjust matrix transpose

* Remove all PETSC_VERSION usage.

* Update PETSc settings for tlaloc

* Add smoke test for 240-bus dynamic simulation

* Add a CMake macro to do the LU solver check and use it

* Add smoke test for 2-area dynamic simulation

* Add smoke test for two area case w/ renewables

* Remove math::transpose(Matrix<>, Matrix<>) because of PETSc transpose semantics

* Add smoke tests for dsf2.x (same as for dsf.x).

* Add smoke test for Wind DSA 9-bus case

* Fix typos in Wind DSA TAMU 2000 case

* Add a smoke test for Wind DSA TAMU 2000 case

* Remove ancient test from `applications/development/dynamic_simulation_reduced_y`

* Fix incorrect Matrix type Jacobian supplied to DAESolver constructor

* Update my build script

* Remove 2000-bus Wind-DSA smoke tests - they take too long

* Switch to a direct linear solver because this was diverging

* Fix memory crash for dsf application in Release mode.

getAngle() method is messed up and it is NOT defined for REGCA1 model

* Use gridpack::Environment instead of direct MPI/GA calls

* Use a newer PETSc w/o SuperLU_dist on tlaloc

* Fix parallel mapper test that failed with newer PETSc

---------

Co-authored-by: Shri <[email protected]>
bjpalmer pushed a commit that referenced this pull request Apr 12, 2024
* Remove GA-based dense matrix; adjust matrix transpose

* Remove all PETSC_VERSION usage.

* Update PETSc settings for tlaloc

* Add smoke test for 240-bus dynamic simulation

* Add a CMake macro to do the LU solver check and use it

* Add smoke test for 2-area dynamic simulation

* Add smoke test for two area case w/ renewables

* Remove math::transpose(Matrix<>, Matrix<>) because of PETSc transpose semantics

* Add smoke tests for dsf2.x (same as for dsf.x).

* Add smoke test for Wind DSA 9-bus case

* Fix typos in Wind DSA TAMU 2000 case

* Add a smoke test for Wind DSA TAMU 2000 case

* Remove ancient test from `applications/development/dynamic_simulation_reduced_y`

* Fix incorrect Matrix type Jacobian supplied to DAESolver constructor

* Update my build script

* Remove 2000-bus Wind-DSA smoke tests - they take too long

* Switch to a direct linear solver because this was diverging

* Fix memory crash for dsf application in Release mode.

getAngle() method is messed up and it is NOT defined for REGCA1 model

* Use gridpack::Environment instead of direct MPI/GA calls

* Use a newer PETSc w/o SuperLU_dist on tlaloc

* Fix parallel mapper test that failed with newer PETSc

---------

Co-authored-by: Shri <[email protected]>
bjpalmer added a commit that referenced this pull request Apr 12, 2024
* Initial checkin of PSS/E v34 export module

* Minor format change.

* Different module for impedence corrections is unecessary.

* This module is unecessary. V33 and V34 are the same for the impedence
correctioThis module is unecessary. V33 and V34 are the same for the impedence
correction.

* Corrected name on export_sys_switch file.

* Fixed export module for v34 PSS/E files.

* Add methods to export PSSE33 and PSSE34 from the hadrec application

* Added some error handling code to parsing.

* Fixed up parsing error for some transformer values.

* Fixed up a problem with branch names in the PSS/E v34 parser and export modules.

* Fix non-working unit tests and smoke tests for new features. (#164)

* Remove GA-based dense matrix; adjust matrix transpose

* Remove all PETSC_VERSION usage.

* Update PETSc settings for tlaloc

* Add smoke test for 240-bus dynamic simulation

* Add a CMake macro to do the LU solver check and use it

* Add smoke test for 2-area dynamic simulation

* Add smoke test for two area case w/ renewables

* Remove math::transpose(Matrix<>, Matrix<>) because of PETSc transpose semantics

* Add smoke tests for dsf2.x (same as for dsf.x).

* Add smoke test for Wind DSA 9-bus case

* Fix typos in Wind DSA TAMU 2000 case

* Add a smoke test for Wind DSA TAMU 2000 case

* Remove ancient test from `applications/development/dynamic_simulation_reduced_y`

* Fix incorrect Matrix type Jacobian supplied to DAESolver constructor

* Update my build script

* Remove 2000-bus Wind-DSA smoke tests - they take too long

* Switch to a direct linear solver because this was diverging

* Fix memory crash for dsf application in Release mode.

getAngle() method is messed up and it is NOT defined for REGCA1 model

* Use gridpack::Environment instead of direct MPI/GA calls

* Use a newer PETSc w/o SuperLU_dist on tlaloc

* Fix parallel mapper test that failed with newer PETSc

---------

Co-authored-by: Shri <[email protected]>

* Remove stale code (#182)

* Remove stale Fortran interface

* Remove Fortran interface documentation

* Remove optimization/expression code and documentation

* Remove sandbox code

* Remove stale debian packaging (#181)

* Remove dynamic_simulation, dynamic_simulation_implicit, and powerflow2 applications.

---------

Co-authored-by: Shri <[email protected]>

* Remove cmake-jedbrown and all related stuff (#180)

* Remove cmake-jedbrown submodule

* Stop using Jed Brown's find petsc stuff

* Cleaned up some unresolved code.

---------

Co-authored-by: Bruce J Palmer <[email protected]>
Co-authored-by: William Perkins <[email protected]>
Co-authored-by: Bruce J Palmer <[email protected]>
Co-authored-by: Bill <[email protected]>
Co-authored-by: Shri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unit test failures
4 participants