Unable to run model in multiprocess or parallel #300

Closed
intelligent-222 opened this issue Aug 24, 2020 · 37 comments

Comments

@intelligent-222

Hi,

I have a problem running eppy with multiprocess.Pool in parallel. I have an objective function that I want to minimize, and I need to run an idf 100 times to find a mean that I will use in my objective function. When I tried multiprocess.Pool, the code gave me an error related to ep_version, and when I appended ep_version to my (idf, epw) list, I got an error that a list object does not have a run function.

I went through the “def test_multiprocess_run(self):” function and replicated it with my idf, and it works properly. However, it didn’t produce a csv file in the output folder. It also creates a separate folder for each run, which I don’t need, since in each run I only want to read a variable and store it in a predefined list. Could you please help me figure out this problem? I have been looking for a solution for the past 5 days and have tried every potential solution I could find.
My objective function is:
def f(X):
    w = X[0]
    fname1 = "/MyidfFile.idf"
    epwfile = "/MyepwFile.epw"
    idf = IDF(fname1, epwfile)

    # Here I have a few lines of code that make some changes to my idf and save it as a new idf

    idf.saveas("/MyidfFile.idf")
    fname1 = "/MyidfFile.idf"
    epwfile = "/MyepwFile.epw"
    idf = IDF(fname1, epwfile)

    MyObjectList = []

    for i in range(100):
        idf.run(expandobjects=True, readvars=True)
        Data = pd.read_csv('eplusout.csv')
        MyTargerVar = Data['My Target Var Column Name'].sum()
        MyObjectList.append(MyTargerVar)

    return np.sum(MyObjectList)

I want to run the loop in parallel using multiprocess.Pool or any other parallel-run approach. So I separated the loop body into a new function and tried to call it from my objective function using Pool or parallel-run packages.
def EPRUN(I):
    idf = I
    idf.run(expandobjects=True, readvars=True)
    Data = pd.read_csv('eplusout.csv')
    MyTargerVar = Data['My Target Var Column Name'].sum()
    return MyTargerVar
On my first attempt, I created a list IDFS with 100 elements, all equal to my “idf = IDF(fname1,epwfile)”, and used the following package:
num_cores = multiprocessing.cpu_count()
TRELC = Parallel(n_jobs=num_cores)(delayed(EPRUN)(IDFS) for IDF in IDFS)

Error : ep_version should be define

So I reviewed santoshphilip's post related to the test runner and used the same approach to define my list of IDFs:
IDFS = []
ep_version = '-'.join(str(x) for x in modeleditor.IDF.idd_version[:3])
VERSION = '9-2-0'
assert ep_version == VERSION
for i in range(100):
    kwargs = {'output_directory': 'results_%s' % i,
              'ep_version': ep_version}
    IDFS.append([[fname1, epwfile], kwargs])

I tested it as a separate function without EPRUN, using multirunner (similar to the test_runner code), and it works, but it gave me 100 folders of results, one per run, without the csv file that I need to extract the information for my target variable. And when I used it to run my EPRUN function, it gave me an error that the “run” function is not defined for the object, which I assume is related to the new list “IDFS” that I defined.

I went through section 6.1 of the documentation, “Running in parallel processes”. It mentions parallel runs, and it seems there should be an example there that I couldn’t find.
“You first need to create your jobs as a list of lists in the form:
[[<IDF>, <dict of command line parameters>], ...]
The example here just creates 4 identical jobs apart from the output_directory the results are saved in, but you would obviously want to make each job different.
Then run the jobs on the required number of CPUs using runIDFs. . .”
I have tried to use “runIDFs” and it doesn’t work either.
I am pretty sure there is a trick here that I have missed and haven’t found so far. Could anyone please help me solve this issue? Any help would be greatly appreciated.

In summary, I need to run my EPRUN function 100 times with one idf in parallel or multiprocess and pass the results to my objective function for further analysis.

@santoshphilip
Owner

I think I understand what you are trying to do.
Let me take a closer look.

@santoshphilip
Owner

If you are online now, stay online. I might have some questions as I look thru this (I am looking thru now)

@intelligent-222
Author

Sure, thank you

@santoshphilip
Owner

There may be a quick way round this.

Take a look at the package zeppy https://github.com/pyenergyplus/zeppy
Look at the documentation at https://zeppy.readthedocs.io/en/latest/
Specifically look at the following pages

I wrote this at the start of the covid lockdown. See if it makes sense and check if it works for you.
It may be easier for me to help you solve your issue using zeppy, since the code is still fresh in my mind.

What time zone are you in?

@santoshphilip
Owner

I am asking about the time zone, since I am in Pacific time and it is late here

@intelligent-222
Author

EST, Canada. To be honest with you, I have gone as far as the second page of Google search results looking for a potential solution, and I also tried zeppy. Let me try it again, but I believe there was an error related to identifying the parallel pipe command.

Meanwhile, could you please let me know if it is possible to get a csv file as output from the multirunner command? Worst case, I can use that function to create 100 csv files and extract my data from those.

Best,

@santoshphilip
Owner

OK it is far later in the night for you.

I need to go to bed now.
I'll take a look at this in the morning and see how to get the csv files.

This has to work

@intelligent-222
Author

Great, thank you. I appreciate your time and consideration. I am going to give zeppy another try to see if it runs in my notebook.

Stay safe and good night,

@santoshphilip
Owner

Good Night.

If zeppy does not work - don't try too hard.
(better if I make it work - there is some magic going on behind the scenes in zeppy :-)
Zeppy is very much an alpha version software.

Can you send me

  • the IDF file
  • link to weather file
  • what you are changing in the IDF file (the script that does the changing)
  • I know what you are extracting from the csv (from your initial post)
  • how many cores are in your machine

I can get rolling once I have my coffee in the morning.
I want to fix this quick, since I got a week full of deadlines :-(

@santoshphilip
Owner

The zeppy documentation is made from a notebook
see
zeppy/docs/tutorial_docs/

and it worked

@intelligent-222
Author

my first attempt using zeppy.zmq_parallelpipe gave me the following error:
module 'zeppy' has no attribute 'zmq_parallelpipe'

and my second attempt using ppipes.ipc_parallelpipe(EPRUN, IDFS, nworkers=4) gave me
ZMQError: Protocol not supported

@intelligent-222
Author

Regarding your request:

  • the IDF file: how can I send it to you privately, not as a public post?
  • link to weather file: same as above?
  • what you are changing in the IDF file (the script that does the changing): to make it easier, let's assume I am not changing anything and I just want to run one idf 100 times in parallel and extract specific information from each run through the csv results.
  • I know what you are extracting from the csv (from your initial post)
  • how many cores are in your machine: based on "print("Number of cpu : ", multiprocessing.cpu_count())", my machine has 4 cores

@intelligent-222
Author

intelligent-222 commented Aug 24, 2020

Here is my last try to replicate your code in eplus_zeppy.ipynb

zeppy

ZMQError: Protocol not supported

@santoshphilip
Owner

see if this code works for you (it worked on my machine)
I'll check my messages during the day.
So let me know if it works

You may have to change pathnames for the files

"""multiprocessing runs"""


import os 
from eppy.modeleditor import IDF
from eppy.runner.run_functions import runIDFs

def make_options(idf):
    idfversion = idf.idfobjects['version'][0].Version_Identifier.split('.')
    idfversion.extend([0] * (3 - len(idfversion)))
    idfversionstr = '-'.join([str(item) for item in idfversion])
    fname = idf.idfname
    options = {
        'ep_version':idfversionstr,
        'output_prefix':os.path.basename(fname).split('.')[0],
        'output_suffix':'C',
        'output_directory':os.path.dirname(fname),
        'readvars':True,
        'expandobjects':True
        }
    return options




def main():
    iddfile = "/Applications/EnergyPlus-9-3-0/Energy+.idd"
    IDF.setiddname(iddfile)
    epwfile = "../temp/eplusfiles/weather/USA_CO_Denver/USA_CO_Denver.Intl.AP.725650_TMY3.epw"


    runs = []

    # File is from the Examples Folder
    idfname = "../temp/eplusfiles/HVACTemplate-5ZoneBaseboardHeat.idf"
    idf = IDF(idfname, epwfile)
    theoptions = make_options(idf) 
    ep_version = theoptions["ep_version"]
    # i = 1
    runs.append([idf, theoptions])

    # copy of previous file
    idfname = "../temp/eplusfiles/HVACTemplate-5ZoneBaseboardHeat1.idf"
    idf = IDF(idfname, epwfile)
    theoptions = make_options(idf)
    ep_version = theoptions["ep_version"]
    runs.append([idf, theoptions])

    num_CPUs = 2
    runIDFs(runs, num_CPUs)
    # idf.run(**theoptions)

if __name__ == '__main__':
    main()
    # make sure you run it with if __name__
    # weird shit happens if you don't

Some notes to myself:

  • This took a long time to get working
  • I need to update the documentation to give clarity on how to do this
  • open this as a new issue
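
The version handling inside make_options can be checked in isolation. The sketch below mirrors those three lines in a standalone helper. The name ep_version_string is hypothetical (it is not part of eppy), and it pads with the string "0" rather than the integer 0, which gives the same joined result.

```python
def ep_version_string(version_identifier):
    """Turn an IDF Version_Identifier such as "9.3" into "9-3-0".

    Hypothetical helper that mirrors the padding logic in make_options:
    split on dots, pad to three parts with zeros, join with dashes.
    """
    parts = version_identifier.split(".")
    # Pad short identifiers ("9.3" or "9") up to major-minor-patch.
    parts.extend(["0"] * (3 - len(parts)))
    return "-".join(parts)


if __name__ == "__main__":
    print(ep_version_string("9.3"))
```

The resulting string is what the thread passes as 'ep_version' in the options dict.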

@intelligent-222
Author

Thank you, santoshphilip. I changed the paths and ran the file as follows. There is an error that I couldn't figure out. Also, I assigned the same idf to my run list twice instead of two different ones.

"""multiprocessing runs"""

import os
from eppy.modeleditor import IDF
from eppy.runner.run_functions import runIDFs

def make_options(idf):
    idfversion = idf.idfobjects['version'][0].Version_Identifier.split('.')
    idfversion.extend([0] * (3 - len(idfversion)))
    idfversionstr = '-'.join([str(item) for item in idfversion])
    fname = idf.idfname
    options = {
        'ep_version':idfversionstr,
        'output_prefix':os.path.basename(fname).split('.')[0],
        'output_suffix':'C',
        'output_directory':os.path.dirname(fname),
        'readvars':True,
        'expandobjects':True
        }
    return options

def main():
    iddfile = "/EnergyPlusV9-2-0/Energy+.idd"
    IDF.setiddname(iddfile)
    epwfile = "/Users/EPPY/GA/CAN_PQ_Montreal.Intl.AP.716270_CWEC.epw"

    runs = []

    # File is from the Examples Folder
    idfname =  "/Users/EPPY/GA/practice1.idf"
    idf = IDF(idfname, epwfile)
    theoptions = make_options(idf) 
    ep_version = theoptions["ep_version"]
    # i = 1
    runs.append([idf, theoptions])

    # copy of previous file
    idfname =  "/Users/EPPY/GA/practice1.idf"
    idf = IDF(idfname, epwfile)
    theoptions = make_options(idf)
    ep_version = theoptions["ep_version"]
    runs.append([idf, theoptions])

    num_CPUs = 2
    runIDFs(runs, num_CPUs)
    # idf.run(**theoptions)

if __name__ == '__main__':
    main()
    # make sure you run it with if __name__
    # weird shit happens if you don't

@intelligent-222
Author

Error:

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "c:\users\anaconda3\envs\elective\lib\site-packages\eppy\runner\run_functions.py", line 357, in run
check_call(cmd)
File "c:\users\anaconda3\envs\elective\lib\subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['C:/EnergyPlusV9-2-0\energyplus.exe', '--weather', 'C:\Users\EPPY\GA\CAN_PQ_Montreal.Intl.AP.716270_CWEC.epw', '--output-directory', 'C:\Users\EPPY\GA', '--expandobjects', '--readvars', '--output-prefix', 'practice1', '--output-suffix', 'C', 'C:\Users\elective\multi_runs\idf_1\in.idf']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\users\ anaconda3\envs\elective\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "c:\users\anaconda3\envs\elective\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "c:\users\anaconda3\envs\elective\lib\site-packages\eppy\runner\run_functions.py", line 205, in multirunner
run(*args[0], **args[1])
File "c:\users\anaconda3\envs\elective\lib\site-packages\eppy\runner\run_functions.py", line 362, in run
raise EnergyPlusRunError(message)
eppy.runner.run_functions.EnergyPlusRunError:

Contents of EnergyPlus error file at C:\Users\EPPY\GA\eplusout.err
Program Version,EnergyPlus, Version 9.2.0-921312fa1d, YMD=2020.08.23 13:43,
************* Testing Individual Branch Integrity
************* All Branches passed integrity testing
************* Testing Individual Supply Air Path Integrity
************* All Supply Air Paths passed integrity testing
************* Testing Individual Return Air Path Integrity
************* All Return Air Paths passed integrity testing
************* No node connection errors were found.
************* Beginning Simulation
************* Simulation Error Summary *************
************* There are 6 unused schedules in input.
************* There are 4 unused week schedules in input.
************* There are 4 unused day schedules in input.
************* Use Output:Diagnostics,DisplayUnusedSchedules; to see them.
************* EnergyPlus Warmup Error Summary. During Warmup: 0 Warning; 0 Severe Errors.
************* EnergyPlus Sizing Error Summary. During Sizing: 0 Warning; 0 Severe Errors.
************* EnergyPlus Completed Successfully-- 0 Warning; 0 Severe Errors; Elapsed Time=00hr 00min 29.52sec

"""

The above exception was the direct cause of the following exception:

EnergyPlusRunError Traceback (most recent call last)
in
52
53 if name == 'main':
---> 54 main()
55 # make sure you run it with if name
56 # weird shit happens if you don't

in main()
48
49 num_CPUs = 2
---> 50 runIDFs(runs, num_CPUs)
51 # idf.run(**theoptions)
52

c:\users\anaconda3\envs\elective\lib\site-packages\eppy\runner\run_functions.py in runIDFs(jobs, processors)
169 try:
170 pool = mp.Pool(processors)
--> 171 pool.map(multirunner, prepared_runs)
172 pool.close()
173 except NameError:

c:\users\anaconda3\envs\elective\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
266 in a list that is returned.
267 '''
--> 268 return self._map_async(func, iterable, mapstar, chunksize).get()
269
270 def starmap(self, func, iterable, chunksize=None):

c:\users\anaconda3\envs\elective\lib\multiprocessing\pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):

EnergyPlusRunError:

Contents of EnergyPlus error file at C:\Users\EPPY\GA\eplusout.err
Program Version,EnergyPlus, Version 9.2.0-921312fa1d, YMD=2020.08.23 13:43,
************* Testing Individual Branch Integrity
************* All Branches passed integrity testing
************* Testing Individual Supply Air Path Integrity
************* All Supply Air Paths passed integrity testing
************* Testing Individual Return Air Path Integrity
************* All Return Air Paths passed integrity testing
************* No node connection errors were found.
************* Beginning Simulation
************* Simulation Error Summary *************
************* There are 6 unused schedules in input.
************* There are 4 unused week schedules in input.
************* There are 4 unused day schedules in input.
************* Use Output:Diagnostics,DisplayUnusedSchedules; to see them.
************* EnergyPlus Warmup Error Summary. During Warmup: 0 Warning; 0 Severe Errors.
************* EnergyPlus Sizing Error Summary. During Sizing: 0 Warning; 0 Severe Errors.
************* EnergyPlus Completed Successfully-- 0 Warning; 0 Severe Errors; Elapsed Time=00hr 00min 29.52sec

@intelligent-222
Author

I believe the problem is related to the issue that Windows and the notebook have with multiprocessing. Please refer to the following links:
stackoverflow.com/questions/47313732/jupyter-notebook-never-finishes-processing-using-multiprocessing-python-3

jupyter/notebook#1703

@santoshphilip
Owner

Can you test it outside of jupyter notebook ?

python  script.py

@santoshphilip
Owner

I ran it successfully in jupyter notebook

remove

if __name__ == '__main__':
    main()

and just run

main()

@santoshphilip
Owner

my jupyter notebook terminal looked like this:

(eppy3) santoshphilip@Santoshs-MacBook-Air eppy % jupyter notebook
[I 13:26:08.839 NotebookApp] Serving notebooks from local directory: /Users/santoshphilip/Documents/coolshadow/github/eppy
[I 13:26:08.840 NotebookApp] Jupyter Notebook 6.1.3 is running at:
[I 13:26:08.840 NotebookApp] http://localhost:8888/?token=fbb68106aa772170ebdafbfd9593de096bd0b02fcb8cccc2
[I 13:26:08.840 NotebookApp]  or http://127.0.0.1:8888/?token=fbb68106aa772170ebdafbfd9593de096bd0b02fcb8cccc2
[I 13:26:08.840 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 13:26:08.852 NotebookApp] 

    To access the notebook, open this file in a browser:
        file:///Users/santoshphilip/Library/Jupyter/runtime/nbserver-16613-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=fbb68106aa772170ebdafbfd9593de096bd0b02fcb8cccc2
     or http://127.0.0.1:8888/?token=fbb68106aa772170ebdafbfd9593de096bd0b02fcb8cccc2
[I 13:26:15.579 NotebookApp] Kernel started: 2c5d1898-c082-47f0-a8fa-ddc1fee0509c, name: python3
[W 13:26:15.626 NotebookApp] 404 GET /nbextensions/splitcell/splitcell.js?v=20200824132608 (::1) 37.48ms referer=http://localhost:8888/notebooks/temp3.ipynb

/Applications/EnergyPlus-9-3-0/energyplus --weather /Users/santoshphilip/Documents/coolshadow/github/temp/eplusfiles/weather/USA_CO_Denver/USA_CO_Denver.Intl.AP.725650_TMY3.epw --output-directory /Users/santoshphilip/Documents/coolshadow/github/temp/eplusfiles --expandobjects --readvars --output-prefix HVACTemplate-5ZoneBaseboardHeat --output-suffix C /Users/santoshphilip/Documents/coolshadow/github/eppy/multi_runs/idf_0/in.idf


/Applications/EnergyPlus-9-3-0/energyplus --weather /Users/santoshphilip/Documents/coolshadow/github/temp/eplusfiles/weather/USA_CO_Denver/USA_CO_Denver.Intl.AP.725650_TMY3.epw --output-directory /Users/santoshphilip/Documents/coolshadow/github/temp/eplusfiles --expandobjects --readvars --output-prefix HVACTemplate-5ZoneBaseboardHeat1 --output-suffix C /Users/santoshphilip/Documents/coolshadow/github/eppy/multi_runs/idf_1/in.idf

ExpandObjects Started.
ExpandObjects Started.
 Begin reading Energy+.idd file.
 Begin reading Energy+.idd file.
 Done reading Energy+.idd file.
 Done reading Energy+.idd file.
ExpandObjects Finished. Time:     0.056
ExpandObjects Finished. Time:     0.056
EnergyPlus Starting
EnergyPlus, Version 9.3.0-baff08990c, YMD=2020.08.24 13:26
EnergyPlus Starting
EnergyPlus, Version 9.3.0-baff08990c, YMD=2020.08.24 13:26
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Initializing Response Factors
Initializing Response Factors
Calculating CTFs for "ROOF-1", Construction # 1
Calculating CTFs for "ROOF-1", Construction # 1
Calculating CTFs for "WALL-1", Construction # 2
Calculating CTFs for "WALL-1", Construction # 2
Calculating CTFs for "FLOOR-SLAB-1", Construction # 4
Calculating CTFs for "INT-WALL-1", Construction # 5
Calculating CTFs for "FLOOR-SLAB-1", Construction # 4
Calculating CTFs for "INT-WALL-1", Construction # 5
Initializing Window Optical Properties
Initializing Window Optical Properties
Initializing Solar Calculations
Initializing Solar Calculations
Allocate Solar Module Arrays
Allocate Solar Module Arrays
Initializing Zone and Enclosure Report Variables
Initializing Zone and Enclosure Report Variables
Initializing Surface (Shading) Report Variables
Initializing Surface (Shading) Report Variables
Computing Interior Solar Absorption Factors
Determining Shadowing Combinations
Computing Interior Solar Absorption Factors
Determining Shadowing Combinations
Computing Window Shade Absorption Factors
Proceeding with Initializing Solar Calculations
Computing Window Shade Absorption Factors
Proceeding with Initializing Solar Calculations
Initializing Surfaces
Initializing Outdoor environment for Surfaces
Initializing Surfaces
Initializing Outdoor environment for Surfaces
Setting up Surface Reporting Variables
Setting up Surface Reporting Variables
Initializing Temperature and Flux Histories
Initializing Window Shading
Computing Interior Absorption Factors
Computing Interior Diffuse Solar Absorption Factors
Computing Interior Diffuse Solar Exchange through Interzone Windows
Initializing Solar Heat Gains
Initializing Internal Heat Gains
Initializing Interior Solar Distribution
Initializing Interior Convection Coefficients
Gathering Information for Predefined Reporting
Initializing Temperature and Flux Histories
Initializing Window Shading
Computing Interior Absorption Factors
Computing Interior Diffuse Solar Absorption Factors
Computing Interior Diffuse Solar Exchange through Interzone Windows
Initializing Solar Heat Gains
Initializing Internal Heat Gains
Completed Initializing Surface Heat Balance
Calculate Outside Surface Heat Balance
Calculate Inside Surface Heat Balance
Calculate Air Heat Balance
Initializing Interior Solar Distribution
Initializing Interior Convection Coefficients
Gathering Information for Predefined Reporting
Completed Initializing Surface Heat Balance
Calculate Outside Surface Heat Balance
Calculate Inside Surface Heat Balance
Calculate Air Heat Balance
Initializing HVAC
Initializing HVAC
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Performing Zone Sizing Simulation
...for Sizing Period: #1 CHICAGO_IL_USA ANNUAL HEATING 99% DESIGN CONDITIONS DB
Performing Zone Sizing Simulation
...for Sizing Period: #1 CHICAGO_IL_USA ANNUAL HEATING 99% DESIGN CONDITIONS DB
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Warming up
Performing Zone Sizing Simulation
...for Sizing Period: #2 CHICAGO_IL_USA ANNUAL COOLING 1% DESIGN CONDITIONS DB/MCWB
Performing Zone Sizing Simulation
...for Sizing Period: #2 CHICAGO_IL_USA ANNUAL COOLING 1% DESIGN CONDITIONS DB/MCWB
Adjusting Air System Sizing
Adjusting Standard 62.1 Ventilation Sizing
Initializing Simulation
Adjusting Air System Sizing
Adjusting Standard 62.1 Ventilation Sizing
Initializing Simulation
Reporting Surfaces
Reporting Surfaces
Beginning Primary Simulation
Beginning Primary Simulation
Initializing New Environment Parameters
Warming up {1}
Initializing New Environment Parameters
Warming up {1}
Warming up {2}
Warming up {2}
Warming up {3}
Warming up {3}
Warming up {4}
Warming up {4}
Warming up {5}
Warming up {5}
Warming up {6}
Warming up {6}
Starting Simulation at 01/14/2014 for RUN PERIOD 1
Starting Simulation at 01/14/2014 for RUN PERIOD 1
Initializing New Environment Parameters
Warming up {1}
Initializing New Environment Parameters
Warming up {1}
Warming up {2}
Warming up {2}
Warming up {3}
Warming up {3}
Warming up {4}
Warming up {4}
Warming up {5}
Warming up {5}
Warming up {6}
Warming up {6}
Starting Simulation at 07/07/2015 for RUN PERIOD 2
Starting Simulation at 07/07/2015 for RUN PERIOD 2
Writing tabular output file results using HTML format.
Writing tabular output file results using HTML format.
Writing final SQL reports
Writing final SQL reports
 ReadVarsESO program starting.
 ReadVarsESO program starting.
 ReadVars Run Time=00hr 00min  0.04sec
 ReadVarsESO program completed successfully.
 ReadVars Run Time=00hr 00min  0.05sec
 ReadVarsESO program completed successfully.
 ReadVarsESO program starting.
 ReadVarsESO program starting.
 ReadVars Run Time=00hr 00min  0.03sec
 ReadVarsESO program completed successfully.
EnergyPlus Run Time=00hr 00min  1.67sec
EnergyPlus Completed Successfully.
 ReadVars Run Time=00hr 00min  0.03sec
 ReadVarsESO program completed successfully.
EnergyPlus Run Time=00hr 00min  1.69sec
EnergyPlus Completed Successfully.

@santoshphilip
Owner

You won't see the output in the notebook, since it is running in a separate process

@santoshphilip
Owner

If you are running 100 files, you have to use generators

"""multiprocessing runs"""

# using generators instead of a list
# when you are running a 100 files you have to use generators

import os 
from eppy.modeleditor import IDF
from eppy.runner.run_functions import runIDFs

def make_options(idf):
    idfversion = idf.idfobjects['version'][0].Version_Identifier.split('.')
    idfversion.extend([0] * (3 - len(idfversion)))
    idfversionstr = '-'.join([str(item) for item in idfversion])
    fname = idf.idfname
    options = {
        'ep_version':idfversionstr,
        'output_prefix':os.path.basename(fname).split('.')[0],
        'output_suffix':'C',
        'output_directory':os.path.dirname(fname),
        'readvars':True,
        'expandobjects':True
        }
    return options




def main():
    iddfile = "/Applications/EnergyPlus-9-3-0/Energy+.idd"
    IDF.setiddname(iddfile)
    epwfile = "../temp/eplusfiles/weather/USA_CO_Denver/USA_CO_Denver.Intl.AP.725650_TMY3.epw"



    # File is from the Examples Folder
    idfname1 = "../temp/eplusfiles/HVACTemplate-5ZoneBaseboardHeat.idf"
    # copy of previous file
    idfname2 = "../temp/eplusfiles/HVACTemplate-5ZoneBaseboardHeat1.idf"


    fnames = [idfname1, idfname2]
    idfs = (IDF(fname, epwfile) for fname in fnames)
    runs = ((idf, make_options(idf) ) for idf in idfs)


    num_CPUs = 2
    runIDFs(runs, num_CPUs)

if __name__ == '__main__':
    main()
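
A minimal sketch of what the generator version buys you: each (idf, options) pair is only built when the consumer asks for it. One plausible motivation is that 100 parsed models then do not all have to exist at once, though the thread does not spell out the reason. The helper lazy_jobs is hypothetical, and load and make_options are passed in as plain callables so the sketch runs without eppy. In the code above they would be something like lambda fname: IDF(fname, epwfile) and the make_options function.

```python
def lazy_jobs(fnames, load, make_options):
    """Yield (model, options) pairs one at a time.

    Hypothetical helper: `load` stands in for building an IDF object from a
    file name, `make_options` for the function of the same name above.
    Nothing is loaded until the consumer asks for the next job.
    """
    for fname in fnames:
        model = load(fname)
        yield model, make_options(model)


if __name__ == "__main__":
    # Stand-in callables, only to show the order in which work happens.
    def fake_load(fname):
        print("loading", fname)
        return fname

    jobs = lazy_jobs(["a.idf", "b.idf"], fake_load, lambda m: {"output_prefix": m})
    print("generator created, nothing loaded yet")
    for model, options in jobs:
        print("got job for", model, options)
```

The generator returned here has the same shape as the `runs` generator expression passed to runIDFs above.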

@intelligent-222
Author

I am a bit confused and trying to digest the process. I ran the file as you suggested, by just running main(), and the same error came up.
main

Let me see if I can convert the notebook into a script and run it from the command line.

@intelligent-222
Author

Also, I want to open the csv file and extract some data from each run and append them to the target list.

@santoshphilip
Owner

while fixing this issue, I just fixed #296
This is a long shot.
Take a look at my last comment there (I am grasping at straws here)

@intelligent-222
Author

I am wondering why my notebook can run the following code without any error and create 8 folders with results (except the csv file). The process and functions seem similar. The only difference is that, because of the multiprocessing issue with the notebook, I imported the multiprocess package and used its Pool function, and it works well. Is it possible to use multiprocess instead of the multiprocessing package in the code that you provided? Also, is it possible to add the csv to the output of the multirunner function?
multirunner

import time

from multiprocess import Pool
starttime = time.time()
runs = []
ep_version = '-'.join(str(x) for x in modeleditor.IDF.idd_version[:3])
#VERSION = os.environ["ENERGYPLUS_INSTALL_VERSION"] # used in CI files
VERSION = '9-2-0'
assert ep_version == VERSION
for i in range(8):
    kwargs = {'output_directory': 'results_%s' % i,
              'ep_version': ep_version}
    runs.append([[fname1, epwfile], kwargs])
pool = Pool(4)
pool.map(multirunner, runs)
pool.close()
print('Time taken = {} seconds'.format(time.time() - starttime))

@santoshphilip
Owner

What is curious is that my notebook runs the code successfully and yours does not.
I have the latest jupyter notebook installed Jupyter Notebook 6.1.3

What is your install ?

Right now I am feeling confident that the code is working fine.
The question is "why is it not running on your machine?"
I am a little reluctant to make a fix specific to your machine
Can you test on another machine
also test it as a script, without notebook (first priority)

@intelligent-222
Author

Sure, I will test and let you know the results. Thank you again for all your time and consideration. I still believe the problem is the issue that Windows and the notebook have with the multiprocessing package. My Jupyter Notebook version is 6.1.3.

I ran it from my command line using python and the same error appeared again. (I used the last version, which uses generators.)
command

@intelligent-222
Author

to avoid any confusion, I am using the following code

import os
from eppy.modeleditor import IDF
from eppy.runner.run_functions import runIDFs

def make_options(idf):
    idfversion = idf.idfobjects['version'][0].Version_Identifier.split('.')
    idfversion.extend([0] * (3 - len(idfversion)))
    idfversionstr = '-'.join([str(item) for item in idfversion])
    fname = idf.idfname
    options = {
        'ep_version':idfversionstr,
        'output_prefix':os.path.basename(fname).split('.')[0],
        'output_suffix':'C',
        'output_directory':os.path.dirname(fname),
        'readvars':True,
        'expandobjects':True
        }
    return options

def main():
    iddfile = "/EnergyPlusV9-2-0/Energy+.idd"
    IDF.setiddname(iddfile)
    epwfile = "/Users/EPPY/GA/CAN_PQ_Montreal.Intl.AP.716270_CWEC.epw"
    # File is from the Examples Folder
    idfname1 = "/Users/EPPY/GA/practice1.idf"
    # copy of previous file
    idfname2 = "/Users/EPPY/GA/practice1.idf"
    fnames = [idfname1, idfname1]
    idfs = (IDF(fname, epwfile) for fname in fnames)
    runs = ((idf, make_options(idf) ) for idf in idfs)
    num_CPUs = 2
    runIDFs(runs, num_CPUs)

if __name__ == '__main__':
    main()

@intelligent-222
Author

It is probably a big request, but I think that if I could use multiprocess.Pool instead of multiprocessing.Pool in the runIDFs or multirunner functions (or if you could help me do so), everything would be fine.

As you suggested, I am going to send my notebook (with and without generators) to my friend and ask him to run the code as well.

@intelligent-222
Author

Here are my friend's results, with the same error:
code
error1
error2

@santoshphilip
Owner

OK. I see why it is failing

# File is from the Examples Folder
idfname1 = "/Users/EPPY/GA/practice1.idf"
# copy of previous file
idfname2 = "/Users/EPPY/GA/practice1.idf"

You are running the same file practice1.idf for idfname1 and idfname2
E+ won't let you do that

use a different file for idfname2
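
If the goal is to run one model many times, a minimal sketch of one way to get distinct file names is to copy the IDF under numbered names first. The helper make_copies and the naming scheme are assumptions for illustration, not something eppy provides.

```python
import shutil
from pathlib import Path


def make_copies(idf_path, n, dest_dir):
    """Copy one IDF file n times under distinct names and return the new paths.

    Hypothetical helper: the copies are named <stem>_000.idf, <stem>_001.idf, ...
    so that each run has its own input file and output prefix.
    """
    src = Path(idf_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copies = []
    for i in range(n):
        target = dest / f"{src.stem}_{i:03d}{src.suffix}"
        shutil.copyfile(src, target)
        copies.append(target)
    return copies
```

The returned paths could then replace the fnames list in the generator example, since make_options derives the output prefix from each file name.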

@intelligent-222
Author

Thank you. I just changed the name to Practice 2 and it works. As I mentioned in my first post, the whole purpose of using parallel runs in my project is to run a loop of 100 simulations with multiprocessing. Since I use some random components in my idf, I would like to run it 100 times and check the mean of a few output variables.

MyObjectList=[]

for i in range (100):
    idf.run(expandobjects=True,readvars=True)
    Data=pd.read_csv('eplusout.csv')
    MyTargerVar=Data['My Target Var Column Name'].sum()
    MyObjectList.append(MyTargerVar)

The code is workable; I just have to make a list of 100 idfs (the same idf under different names) and write a loop to open each csv, read the variable, and build MyObjectList. I was wondering if I could run the whole loop in parallel.
Also, in the multirunner example mentioned above, I just made a list of 8 elements (IDFS=[idf]*8) with the same idf, without changing the name, and fed it into the function.

By the way, thank you for all your help and support. I greatly appreciate that.

@santoshphilip
Owner

I consider this issue fully resolved.
In effect, eppy is able to do the E+ simulation by multiprocessing on multiple processors. The sample code above illustrates how it is done.

The rest of my comments are on what you are going to do with this functionality.

You would like to do the following:

  • Run multiple simulations
  • Gather output data from the simulations
  • Post-process the output data from all simulations

If you are using idf.run(), you can do this in the following manner:

results = []

idf1.run()
result = getresults(idf1)
results.append(result)

idf2.run()
result = getresults(idf2)
results.append(result)

idf3.run()
result = getresults(idf3)
results.append(result)

finalresult = postprocess(results)

You cannot do it this way if you are multiprocessing. Your code looks like this:

runs = ((idf, make_options(idf) ) for idf in idfs)
num_CPUs = 6
runIDFs(runs, num_CPUs)
  • runIDFs can run only an E+ simulation.
  • It cannot run your code.
    • it cannot even tell you when it is finished
    • Basically at this point you have no control. The multiple processes run away and do their thing
  • You have to wait for all the simulations to complete
    • go have a coffee, drink a beer, take a nap etc.
  • run another script that will gather all the outputs and post-process
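The separate gathering script could look something like the sketch below. It assumes that each simulation left one csv file in a common output directory and that the column of interest holds plain numbers in every row. The function names and the `*.csv` pattern are illustrative only:

```python
import csv
import glob
import os


def column_sum(csv_path, column):
    """Sum one numeric column of a csv file."""
    with open(csv_path, newline="") as handle:
        return sum(float(row[column]) for row in csv.DictReader(handle))


def gather_results(output_dir, column, pattern="*.csv"):
    """Return {file name: column sum} for every matching csv in output_dir."""
    paths = sorted(glob.glob(os.path.join(output_dir, pattern)))
    return {os.path.basename(path): column_sum(path, column) for path in paths}


def postprocess(results):
    """Mean of the per-simulation sums."""
    values = list(results.values())
    return sum(values) / len(values)
```

Run it only after all the simulations have finished, for example `postprocess(gather_results("/path/to/outputs", "Light"))`, where the directory and column name are placeholders for your own.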

If you use zeppy https://github.com/pyenergyplus/zeppy, you can do the whole thing with one script.

  • write a function that will simulate an IDF file and then collect the output and return it
  • Zeppy will send the results from each simulation to collection point called the sink
  • The sink knows when all the simulations are completed. Now it can postprocess the results.
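The general shape of that idea (a worker that returns its result, and a collection point that post-processes once everything is in) can be sketched with the standard library alone. This is not zeppy's API. `simulate_and_collect` is a hypothetical stand-in whose body uses placeholder arithmetic where the real code would run the IDF and read its output:

```python
from concurrent.futures import ProcessPoolExecutor


def simulate_and_collect(job):
    """Hypothetical worker: one simulation in, one result out.

    In real use this would run the IDF named in `job`, read its csv,
    and return the number of interest. Here it only doubles an input.
    """
    name, value = job
    return name, float(value) * 2.0


def postprocess(results):
    """Mean of the values returned by the workers."""
    values = [value for _, value in results]
    return sum(values) / len(values)


def run_all(jobs, num_workers=2):
    """Run every job in a process pool, then post-process the results."""
    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        # map() returns only after every worker has handed back its result
        results = list(executor.map(simulate_and_collect, jobs))
    return postprocess(results)
```

Call `run_all(jobs)` from inside an `if __name__ == '__main__':` block, since process pools need that guard on some platforms.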

I understand that you are not able to get zeppy to run.
Open an issue in the zeppy repository and we can take a look.

@intelligent-222
Author

Thank you very much. Based on your informative explanation, I believe the issue is solved and can be closed. I will open a new issue for zeppy, since I found that package very useful as well. Also, I still believe it would be great if you could add the csv file to the output of the above-mentioned multirunner function. Thank you for all your time and help.

@santoshphilip
Owner

I would appreciate your help in getting this working in zeppy.
What you are trying to do is the perfect use case for zeppy. I would like to use it as a real example in the documentation (we can modify it so that we don't use your actual file). Right now the example in zeppy is a toy example.
It will also be a good proof of concept: that it works for a real problem that is not trivial.

@intelligent-222
Author

intelligent-222 commented Aug 26, 2020

That would be great. I will provide you with my idf and objective function as well as the process, so we can go through it and run it with zeppy. I strongly believe that would help not only me but also other projects similar to mine. Right now I have completed my code to create copies of the idf and feed them to the multiprocess run. I have also finished the part that reads the multiple csv files and extracts the objective variable that I want to use in the ga package. It seems that using the main() function as the objective function in ga is not possible. I am working on this for the next couple of days and will then start implementing zeppy.

I am leaving my code here in case anyone is curious to know how to implement multiprocessing for one or multiple idfs. The main(X) function works fine on its own. Unfortunately, when I use it as the objective function for ga (as follows), I get strange errors: Contents of EnergyPlus error file at C:\Users\eplusout.err

"""multiprocessing runs"""

import os

import numpy as np
import pandas as pd
from eppy.modeleditor import IDF
from eppy.runner.run_functions import runIDFs


def make_options(idf):
    """Build the run options for one IDF from its version and file name."""
    idfversion = idf.idfobjects['version'][0].Version_Identifier.split('.')
    idfversion.extend([0] * (3 - len(idfversion)))
    idfversionstr = '-'.join([str(item) for item in idfversion])
    fname = idf.idfname
    options = {
        'ep_version': idfversionstr,
        'output_prefix': os.path.basename(fname).split('.')[0],
        'output_suffix': 'C',
        'output_directory': os.path.dirname(fname),
        'readvars': True,
        'expandobjects': True,
    }
    return options


def main(X):
    iddfile = "/EnergyPlusV9-2-0/Energy+.idd"
    IDF.setiddname(iddfile)
    epwfile = "/Users/CAN_PQ_Montreal.Intl.AP.716270_CWEC.epw"
    w = X
    fname1 = "/Users/practice1.idf"
    idf = IDF(fname1, epwfile)
    model = idf.idfobjects['EnergyManagementsystem:program'][1]
    model.Program_Line_7 = "set x=" + str(w)
    # Save four copies under different names, since E+ will not run
    # the same file more than once in parallel.
    fnames = []
    for i in range(1, 5):
        idf.saveas('/Users/practice%d.idf' % i)
        fnames.append('/Users/practice%d.idf' % i)
    idfs = (IDF(fname, epwfile) for fname in fnames)
    runs = ((idf, make_options(idf)) for idf in idfs)
    num_CPUs = 4
    runIDFs(runs, num_CPUs)
    # Read the csv from each run and sum the target column.
    TRELC = []
    for i in range(1, 5):
        Data = pd.read_csv('practice%d.csv' % i)
        ELC = Data['Light'].sum()
        TRELC.append(ELC)
    return np.sum(TRELC)

The ga code is like this:

varbound = np.array([[1, 3]])

model = ga(function=main, dimension=1, variable_type='int',
           variable_boundaries=varbound, function_timeout=20000,
           algorithm_parameters=algorithm_param)
model.run()
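One thing that may be worth checking, offered as an unconfirmed guess rather than a diagnosis: if ga hands the objective function a one-element array instead of a bare number, then `"set x=" + str(w)` in main would write the array's text form (brackets included) into the EMS program line, which E+ may reject. A minimal sketch of a helper that builds the line from the first entry only, where the name `ems_set_line` is hypothetical:

```python
def ems_set_line(X, name="x"):
    """Return an EMS 'set' statement built from the first entry of X.

    Accepts either a bare number or a one-element sequence or array.
    """
    try:
        value = X[0]
    except (TypeError, IndexError):
        # X is already a bare number (or a zero-dimensional array)
        value = X
    return "set %s=%d" % (name, int(value))
```

In main this would replace the string concatenation, for example `model.Program_Line_7 = ems_set_line(X)`.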
