Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

9202 9222 reset delete state python #9304

Merged
merged 11 commits into from
Mar 4, 2022

Conversation

jmarrec
Copy link
Contributor

@jmarrec jmarrec commented Mar 1, 2022

Pull request overview

Avoid a segfault when calling reset_state / delete_state from python

Pull Request Author

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • Label the PR with at least one of: Defect, Refactoring, NewFeature, Performance, and/or DoNoPublish
  • Pull requests that impact EnergyPlus code must also include unit tests to cover enhancement or defect repair
  • Author should provide a "walkthrough" of relevant code changes using a GitHub code review comment process
  • If any diffs are expected, author must demonstrate they are justified using plots and descriptions
  • If changes fix a defect, the fix should be demonstrated in plots and descriptions
  • If any defect files are updated to a more recent version, upload new versions here or on DevSupport
  • If IDD requires transition, transition source, rules, ExpandObjects, and IDFs must be updated, and add IDDChange label
  • If structural output changes, add to output rules file and add OutputChange label
  • If adding/removing any LaTeX docs or figures, update that document's CMakeLists file dependencies

Reviewer

This will not be exhaustively relevant to every PR.

  • Perform a Code Review on GitHub
  • If branch is behind develop, merge develop and build locally to check for side effects of the merge
  • If defect, verify by running develop branch and reproducing defect, then running PR and reproducing fix
  • If feature, test running new feature, try creative ways to break it
  • CI status: all green or justified
  • Check that performance is not impacted (CI Linux results include performance check)
  • Run Unit Test(s) locally
  • Check any new function arguments for performance impacts
  • Verify IDF naming conventions and styles, memos and notes and defaults
  • If new idf included, locally check the err file and other outputs

@jmarrec jmarrec added the Defect Includes code to repair a defect in EnergyPlus label Mar 1, 2022
@jmarrec jmarrec requested a review from Myoldmopar March 1, 2022 11:57
@jmarrec jmarrec self-assigned this Mar 1, 2022
@@ -243,6 +245,9 @@ struct PluginManagerData : BaseGlobalStruct
bool apiErrorFlag = false;
std::vector<std::string> const objectsToFind = {
"PythonPlugin:OutputVariable", "PythonPlugin:SearchPaths", "PythonPlugin:Instance", "PythonPlugin:Variables", "PythonPlugin:TrendVariable"};

bool eplusRunningViaPythonAPI = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on PluginManagerData, only so I can forward it to to PluginManager when it is finally instantiated.

When stateNewPython is called, the std::unique_ptr<PluginManagement::PluginManager> pluginManager isn't initialized yet, that's done later by the SimulationManager.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. This isn't code that is commonly touched, so I'm not going to rigor over an extra variable, but I am a bit unsure why you can't just always use the one from state, and not have one on the plugin manager at all. I guess it's because the plugin manager dtor doesn't take a state argument...so there is no state to access, only local members.

@@ -209,6 +209,8 @@ namespace PluginManagement {
static int getLocationOfUserDefinedPlugin(EnergyPlusData &state, std::string const &_programName);
static void runSingleUserDefinedPlugin(EnergyPlusData &state, int index);
static bool anyUnexpectedPluginObjects(EnergyPlusData &state);

bool eplusRunningViaPythonAPI = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on the PluginManager class

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it seems this is so that the dtor has access to a copy of the flag.

Comment on lines 651 to 653
if (!this->eplusRunningViaPythonAPI) {
Py_FinalizeEx();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really only so we do call Py_FinalizeEx in the PluginManager destructors when not running from a python api: it's to be safe (when running from C API or not from api at all), but I think we would get away with not doing it at all...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think we could go without it entirely as well......but yeah, this is the safer option.

@@ -402,7 +402,7 @@ void PluginManager::setupOutputVariables([[maybe_unused]] EnergyPlusData &state)
#endif
}

PluginManager::PluginManager(EnergyPlusData &state)
PluginManager::PluginManager(EnergyPlusData &state) : eplusRunningViaPythonAPI(state.dataPluginManager->eplusRunningViaPythonAPI)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ctor initializes the boolean from the PluginManagerData, so we can access it in the Dtor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all making sense now. Nice.

Comment on lines +62 to +68
EnergyPlusState stateNewPython()
{
auto *state = new EnergyPlus::EnergyPlusData;
state->dataPluginManager->eplusRunningViaPythonAPI = true;
return reinterpret_cast<EnergyPlusState>(state);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running from Python, set the boolean to true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@@ -85,7 +85,7 @@ def new_state(self) -> c_void_p:

:return: A pointer to a new state object in memory
"""
return self.api.stateNew()
return self.api.stateNewPython()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state_new in python calls stateNewPython... That's unfortunate to add another extern C method, but given the lack of support in C for default argument values...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a big deal to me at all. This is hidden within the API anyway, so all good.

add_test(NAME "API.TestRuntimeC" COMMAND TestAPI_Runtime_C -d "${TEST_DIR}" -w "${EPW_FILE}" -D "${IDF_FILE}")

set(TEST_DIR "${PROJECT_BINARY_DIR}/tst/api/runtimeresetpython")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 4 tests:

  • stateDelete in both C and python
  • stateReset in both C and python

Without the fix the C ones pass, the python ones are exhibiting the segfault (tested ubuntu 20.04)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always love TDD.

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 1, 2022

…t_state is called and running from the API.
**reverts 7107ac3 and does it differently**

I have to create a boolean on PluginManagerData just so I can forward it to PluginManager, because when stateNewPython is called, the `std::unique_ptr<PluginManagement::PluginManager> pluginManager` isn't initialized yet

And this is only so we do call PyFinalize_Ex in the PluginManager destructors when not running from a **python** api: it's to be safe, but I think we would get away with not doing it at all...
@jmarrec jmarrec force-pushed the 9202_9222_Reset_DeleteState_Python branch from ec2bd71 to 789682f Compare March 1, 2022 13:16
Copy link
Member

@Myoldmopar Myoldmopar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome and looks ready to go. I had to walk through it pretty carefully to understand, but I get it now. Adding the member variable to the plugin manager class itself allows the dtor to respond to it, where it couldn't access state before to check anything. All good! Thanks!

add_test(NAME "API.TestRuntimeC" COMMAND TestAPI_Runtime_C -d "${TEST_DIR}" -w "${EPW_FILE}" -D "${IDF_FILE}")

set(TEST_DIR "${PROJECT_BINARY_DIR}/tst/api/runtimeresetpython")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always love TDD.

@@ -243,6 +245,9 @@ struct PluginManagerData : BaseGlobalStruct
bool apiErrorFlag = false;
std::vector<std::string> const objectsToFind = {
"PythonPlugin:OutputVariable", "PythonPlugin:SearchPaths", "PythonPlugin:Instance", "PythonPlugin:Variables", "PythonPlugin:TrendVariable"};

bool eplusRunningViaPythonAPI = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. This isn't code that is commonly touched, so I'm not going to rigor over an extra variable, but I am a bit unsure why you can't just always use the one from state, and not have one on the plugin manager at all. I guess it's because the plugin manager dtor doesn't take a state argument...so there is no state to access, only local members.

@@ -209,6 +209,8 @@ namespace PluginManagement {
static int getLocationOfUserDefinedPlugin(EnergyPlusData &state, std::string const &_programName);
static void runSingleUserDefinedPlugin(EnergyPlusData &state, int index);
static bool anyUnexpectedPluginObjects(EnergyPlusData &state);

bool eplusRunningViaPythonAPI = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it seems this is so that the dtor has access to a copy of the flag.

@@ -402,7 +402,7 @@ void PluginManager::setupOutputVariables([[maybe_unused]] EnergyPlusData &state)
#endif
}

PluginManager::PluginManager(EnergyPlusData &state)
PluginManager::PluginManager(EnergyPlusData &state) : eplusRunningViaPythonAPI(state.dataPluginManager->eplusRunningViaPythonAPI)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all making sense now. Nice.

Comment on lines 651 to 653
if (!this->eplusRunningViaPythonAPI) {
Py_FinalizeEx();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think we could go without it entirely as well......but yeah, this is the safer option.

Comment on lines +62 to +68
EnergyPlusState stateNewPython()
{
auto *state = new EnergyPlus::EnergyPlusData;
state->dataPluginManager->eplusRunningViaPythonAPI = true;
return reinterpret_cast<EnergyPlusState>(state);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@@ -85,7 +85,7 @@ def new_state(self) -> c_void_p:

:return: A pointer to a new state object in memory
"""
return self.api.stateNew()
return self.api.stateNewPython()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a big deal to me at all. This is hidden within the API anyway, so all good.

print("EnergyPlus Failed!")
sys.exit(1)

print("MUTED E+ RUN 2 DONE")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome new test scripts!

@Myoldmopar
Copy link
Member

Looks like the two new C API tests failed on Linux. The Windows build message looks unrelated. Let me know if you want me to test it out on Linux or anything, @jmarrec. I assume when those are running fine this will merge. Thanks!

@Myoldmopar Myoldmopar added this to the EnergyPlus 22.1 milestone Mar 1, 2022
@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 2, 2022

Looks like the two new C API tests failed on Linux. The Windows build message looks unrelated. Let me know if you want me to test it out on Linux or anything, @jmarrec. I assume when those are running fine this will merge. Thanks!

@Myoldmopar damn... Fatal Python error: Py_Initialize: can't initialize sys. I would bet this is related to the specific version of Python that the linux runner has versus what mine is. Do you know what Python is used on that runner? Is this an Ubuntu 18.04 with system python (= python 3.6)?

@mbadams5
Copy link
Contributor

mbadams5 commented Mar 2, 2022

@jmarrec, what version of python are you using locally? I wonder if this bug is the cause or not (it was fixed in 3.7 and up)? https://bugs.python.org/issue32849

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 2, 2022

While testing this I was testing using 3.9.7...

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 2, 2022

I built on Ubuntu 20.04 with 3.6.7, and indeed it crashes on the second PyInitialize_Ex (but Py_FinalizeEx is called like it was before...). I'm not sure what changed between develop and this, but I suspect develop would throw on this test as well. I'm building the develop+tests (no other changes right now)

Fatal Python error: Py_Initialize: can't initialize sys

Current thread 0x00007fffe51067c0 (most recent call first):
Process 225027 stopped
* thread #1, name = 'TestAPI_Runtime', stop reason = signal SIGABRT
    frame #0: 0x00007fffe59dc03b libc.so.6`raise + 203
libc.so.6`raise:
->  0x7fffe59dc03b <+203>: movq   0x108(%rsp), %rax
    0x7fffe59dc043 <+211>: xorq   %fs:0x28, %rax
    0x7fffe59dc04c <+220>: jne    0x7fffe59dc074            ; <+260>
    0x7fffe59dc04e <+222>: movl   %r8d, %eax
(lldb) bt
* thread #1, name = 'TestAPI_Runtime', stop reason = signal SIGABRT
  * frame #0: 0x00007fffe59dc03b libc.so.6`raise + 203
    frame #1: 0x00007fffe59bb859 libc.so.6`abort + 299
    frame #2: 0x00007fffe51e7ba2 libpython3.6m.so.1.0`Py_FatalError(msg=<unavailable>) at pylifecycle.c:1462:5
    frame #3: 0x00007fffe5302bd4 libpython3.6m.so.1.0`_Py_InitializeEx_Private at pylifecycle.c:401:9
    frame #4: 0x00007fffe5302a42 libpython3.6m.so.1.0`_Py_InitializeEx_Private(install_sigs=0, install_importlib=1) at pylifecycle.c:305
    frame #5: 0x00007fffec75c68e libenergyplusapi.so.22.1.0`EnergyPlus::PluginManagement::PluginManager::PluginManager(this=0x0000555556c4f450, state=0x0000555556be01b0) at PluginManager.cc:437:24
    frame #6: 0x00007fffecd5157e libenergyplusapi.so.22.1.0`std::_MakeUniq<EnergyPlus::PluginManagement::PluginManager>::__single_object std::make_unique<EnergyPlus::PluginManagement::PluginManager, EnergyPlus::EnergyPlusData&>((null)=0x0000555556be01b0) at unique_ptr.h:962:30
    frame #7: 0x00007fffecd24cc6 libenergyplusapi.so.22.1.0`EnergyPlus::SimulationManager::ManageSimulation(state=0x0000555556be01b0) at SimulationManager.cc:287:111
    frame #8: 0x00007fffeb4c2399 libenergyplusapi.so.22.1.0`runEnergyPlusAsLibrary(state=0x0000555556be01b0, argc=7, argv=0x00007fffffffcc38) at EnergyPlusPgm.cc:445:56
    frame #9: 0x00007fffeb1f23a1 libenergyplusapi.so.22.1.0`::energyplus(state=0x0000555556be01b0, argc=7, argv=0x00007fffffffcc38) at runtime.cc:75:34
    frame #10: 0x000055555555670c TestAPI_RuntimeDeleteState_C`main(argc=7, argv=0x00007fffffffcc38) at TestRuntimeDeleteState.c:83:5
    frame #11: 0x00007fffe59bd0b3 libc.so.6`__libc_start_main + 243
    frame #12: 0x00005555555564de TestAPI_RuntimeDeleteState_C`_start + 46
(lldb) fr select 5
frame #5: 0x00007fffec75c68e libenergyplusapi.so.22.1.0`EnergyPlus::PluginManagement::PluginManager::PluginManager(this=0x0000555556c4f450, state=0x0000555556be01b0) at PluginManager.cc:437:24
   434 	    // If arg 0, it skips init registration of signal handlers, which might be useful when Python is embedded.
   435 	    bool alreadyInitialized = (Py_IsInitialized() != 0);
   436 	    if (!alreadyInitialized) {
-> 437 	        Py_InitializeEx(0);
   438 	    }
   439 	
   440 	    // Take control of the global interpreter lock while we are here, make sure to release it...
(lldb) 

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 2, 2022

yup, if you take the C tests only (no other changes) and run them with develop with 3.6.7: they throw...

@Myoldmopar now that we pin the python version on github (to 3.8 currently), can we just get the decent_ci runners to use the same version?

@Myoldmopar
Copy link
Member

Or we could just include the CPython source and always build the exact same version every single time.

@Myoldmopar
Copy link
Member

Just kidding, but that is indeed part of the reason why I wanted to do that in the first place. Anyway, yes, we can do that. I'll need to determine how we'll make sure Decent CI uses the correct version when it builds.

@jmarrec
Copy link
Contributor Author

jmarrec commented Mar 2, 2022

@Myoldmopar pass -DPYTHON_VERSION:STRING=3.8 -DPython_ROOT_DIR:PATH=/path/to/python_root/ eg -DPython_ROOT_DIR:PATH=~/.pyenv/versions/3.8.12/ (under which you have the include/ lib/ bin/ subfolders)

If you use pyenv, make sure you pass --enable-shared, eg PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.8.12 or when linking E+ you'll get

relocation R_X86_64_PC32 against symbol `_PyRuntime' can not be used when making a shared object; recompile with -fPIC

Comment on lines +168 to +169
if(Python_REQUIRED_VERSION)
find_package(Python ${Python_REQUIRED_VERSION} EXACT COMPONENTS Interpreter Development REQUIRED)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to expose PYTHON_INCLUDE_DIR, PYTHON_LIBRARY, and PYTHON_LIBRARY_DEBUG or are those no longer needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mitchute Those are artifacts from when the deprecated find_package(PythonInterp) and find_package(PythonLibs) were used. You don't see them on a clean build.

cf https://cmake.org/cmake/help/latest/module/FindPythonLibs.html

@Myoldmopar
Copy link
Member

Now we're talking, look at that green! I assume this will merge shortly once the integration tests complete. (They were already happy before anyway...)

mark_as_advanced(Python_REQUIRED_VERSION)
mark_as_advanced(Python_ROOT_DIR)
if(Python_REQUIRED_VERSION)
find_package(Python ${Python_REQUIRED_VERSION} EXACT COMPONENTS Interpreter Development REQUIRED)
else()
find_package(Python 3.6 COMPONENTS Interpreter Development REQUIRED)
Copy link
Contributor

@mbadams5 mbadams5 Mar 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment is that this should probably be 3.7 and not 3.6 now since Python 3.6 reached End of Life Dec 2021 (https://endoflife.date/python). But this doesn't have to change in this PR necessarily.

Copy link
Contributor Author

@jmarrec jmarrec Mar 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be annoying for people on Ubuntu 18 04 (or Gort, until yesterday) as the system python is 3.6. I personally dont care, I always use pyenv and much more recent pythons

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbadams5 I do hear you there. Once Ubuntu 22.04 is released next month I plan to reimage our CI machines up to 20.04. That would still be one full LTS behind, which seems pretty safe. In that case, I'd be fully in support of going to 3.7 here. Let's chat about that soon.

@Myoldmopar
Copy link
Member

Alright, aside from @mbadams5 's comment, which will be brought back up soon, this is ready to go. Thanks for these fixes @jmarrec !!! Merging this one in.

@Myoldmopar Myoldmopar merged commit 72e23f2 into develop Mar 4, 2022
@Myoldmopar Myoldmopar deleted the 9202_9222_Reset_DeleteState_Python branch March 4, 2022 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Defect Includes code to repair a defect in EnergyPlus
Projects
None yet
Development

Successfully merging this pull request may close these issues.

delete_state() causing crashes in Linux reset_state does not work
9 participants