Unable to run model in multiprocess or parallel #300
Comments
I think I understand what you are trying to do. |
If you are online now, stay online. I might have some questions as I look thru this (I am looking thru now) |
Sure, thank you |
There may be a quick way round this. Take a look at the package zeppy https://github.com/pyenergyplus/zeppy
I wrote this at the start of the covid lockdown. See if it makes sense and check if it works for you. What time zone are you in? |
I am asking about the time zone, since I am in Pacific time and it is late here |
EST, Canada. To be honest, I have gone through even the second page of Google search results looking for a solution, and I also tried zeppy. Let me try it again, but I believe there was an error related to identifying the parallel pipe command. Meanwhile, could you please let me know whether it is possible to get a csv file as output from the multirunner command? Worst case, I can use that function to create 100 csv files and extract my data from those. Best, |
OK it is far later in the night for you. I need to go to bed now. This has to work |
Great, thank you and appreciate your time and consideration. I am going to give a try to zeppy again to see if it runs on my notebook. Stay safe and good night, |
Good Night. If zeppy does not work - don't try too hard. Can you send me
I can get rolling once I have my coffee in the morning. |
The zeppy documentation is made from a notebook and it worked |
my first attempt using zeppy.zmq_parallelpipe gave me the following error: and my second attempt using ppipes.ipc_parallelpipe(EPRUN, IDFS, nworkers=4) gave me |
Regarding your request for the IDF file: how can I send it to you privately rather than as a public post? |
see if this code works for you (it worked on my machine) You may have to change pathnames for the files
Some notes to myself:
|
Thank you, Santoshphilip. I changed the paths and ran the file as follows. There is an error that I couldn't figure out. Also, I assigned the same idf to my run list instead of two different ones. """multiprocessing runs""" import os def make_options(idf): def main():
if __name__ == '__main__': |
Error:
RemoteTraceback Traceback (most recent call last)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Contents of EnergyPlus error file at C:\Users\EPPY\GA\eplusout.err
"""
The above exception was the direct cause of the following exception:
EnergyPlusRunError Traceback (most recent call last)
in main()
c:\users\anaconda3\envs\elective\lib\site-packages\eppy\runner\run_functions.py in runIDFs(jobs, processors)
c:\users\anaconda3\envs\elective\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
c:\users\anaconda3\envs\elective\lib\multiprocessing\pool.py in get(self, timeout)
EnergyPlusRunError: Contents of EnergyPlus error file at C:\Users\EPPY\GA\eplusout.err |
I believe the problem is related to the issue that Windows and Jupyter notebooks have with multiprocessing. Please refer to the following link: |
Can you test it outside of the Jupyter notebook?
|
I ran it successfully in remove
and just run
|
my
|
You won't see the output in the notebook, since it is running in a separate process |
If you are running 100 files, you have to use generators
|
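One way to read the generator suggestion above: instead of building all 100 jobs up front, yield them one at a time. The sketch below is stdlib-only and hypothetical. `make_jobs`, the file names, and the tuple layout are illustrative assumptions, not eppy API, and this thread does not confirm whether `runIDFs` accepts a generator directly.

```python
import os


def make_jobs(idf_paths, epw_path, ep_version):
    """Lazily yield one (idf_path, epw_path, options) job per input file.

    Hypothetical helper: the tuple layout is illustrative, not eppy's format.
    """
    for i, idf_path in enumerate(idf_paths):
        options = {
            "output_directory": os.path.join("results", "run_%03d" % i),
            "ep_version": ep_version,
        }
        yield idf_path, epw_path, options


# Placeholder file names; nothing is read from disk here.
jobs = make_jobs(["copy_%d.idf" % i for i in range(100)], "weather.epw", "9-2-0")
first = next(jobs)                # only this one job has been built so far
remaining = sum(1 for _ in jobs)  # consuming the generator builds the rest
```

Because each job is created only when requested, memory use stays flat no matter how many runs are queued.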
Also, I want to open the csv file and extract some data from each run and append them to the target list. |
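A minimal sketch of that extraction step, assuming each run writes its own `eplusout.csv` into a separate folder. It uses only the standard library; `column_sum` and `collect_targets` are hypothetical helper names, and the demo builds fake result folders so it does not need EnergyPlus.

```python
import csv
import os
import statistics
import tempfile


def column_sum(csv_path, column):
    """Sum one numeric column of a CSV file, skipping missing or blank cells."""
    total = 0.0
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            value = row.get(column)
            if value and value.strip():
                total += float(value)
    return total


def collect_targets(run_dirs, column, csv_name="eplusout.csv"):
    """Return one summed value per run directory, in the order given."""
    return [column_sum(os.path.join(d, csv_name), column) for d in run_dirs]


# Demo with fake run folders standing in for real EnergyPlus output.
with tempfile.TemporaryDirectory() as root:
    run_dirs = []
    for i in range(3):
        run_dir = os.path.join(root, "results_%d" % i)
        os.makedirs(run_dir)
        with open(os.path.join(run_dir, "eplusout.csv"), "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["Date/Time", "My Target Var Column Name"])
            writer.writerow(["01/01 01:00:00", i + 1])
            writer.writerow(["01/01 02:00:00", i + 1])
        run_dirs.append(run_dir)
    targets = collect_targets(run_dirs, "My Target Var Column Name")
    mean_target = statistics.mean(targets)
```

The target list is built after all runs have finished, so the collection step does not depend on how the runs were parallelised.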
while fixing this issue, I just fixed #296 |
What is curious is that my What is your install ? Right now I am feeling confident that the code is working fine. |
To avoid any confusion, I am using the following code: import os def make_options(idf): def main(): if __name__ == '__main__': |
This is probably a big request, but I think everything would be fine if I could, or you could help me to, use multiprocess.Pool instead of multiprocessing.Pool in the runIDFs or multirunner functions. As you mentioned, I am going to send my notebook (with and without the generator) to my friend and ask him to run the code as well. |
OK. I see why it is failing
You are running the same file. Use a different file for |
Thank you. I just changed the name to Practice 2 and it works. As I mentioned in my first post, the whole purpose of the parallel run in my project is to carry out a loop of 100 runs with multiprocessing. Since I use some random components in my idf, I would like to run it 100 times and check the mean of a few output variables. MyObjectList=[] for i in range (100): The code is applicable; I just have to make a list of 100 idfs (the same idf under different names) and write a loop that opens each csv, reads the variable, and builds MyObjectiveList. I was wondering if I could run the whole loop in parallel. By the way, thank you for all your help and support. I greatly appreciate it. |
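The "same idf with different names" step could look roughly like the following. This is a sketch under assumptions: `make_copies` is a hypothetical helper, the naming scheme is arbitrary, and the demo writes a one-line placeholder file instead of a real IDF.

```python
import os
import shutil
import tempfile


def make_copies(source_idf, count, target_dir):
    """Copy source_idf to `count` distinct file names and return the new paths."""
    stem, ext = os.path.splitext(os.path.basename(source_idf))
    os.makedirs(target_dir, exist_ok=True)
    copies = []
    for i in range(count):
        dest = os.path.join(target_dir, "%s_%03d%s" % (stem, i, ext))
        shutil.copyfile(source_idf, dest)
        copies.append(dest)
    return copies


# Demo with a placeholder file; a real run would point at the actual idf.
with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "Practice.idf")
    with open(source, "w") as handle:
        handle.write("Version,9.2;\n")
    copies = make_copies(source, 5, os.path.join(root, "copies"))
    names = [os.path.basename(p) for p in copies]
    all_exist = all(os.path.exists(p) for p in copies)
```

Each copy has its own file name, so no two jobs are started from the same file.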
I consider this issue as fully resolved. The rest of my comments are on what you are going to do with this functionality. You would like to do the following:
If you are using idf.run() you can do this in the following manners
You cannot do this in this manner if you are multiprocessing. Your code looks like this:
If you use zeppy https://github.com/pyenergyplus/zeppy, you can do the whole thing with one script.
I understand that you are not able to get zeppy to run. |
Thank you very much. Based on your informative explanation, I believe the issue is solved and can be closed. I will open a new issue for zeppy, since I found that package very useful as well. Also, I still think it would be great if you could add the csv file to the output of the above-mentioned multirunner function. Thank you for all your time and help. |
I would appreciate your help in getting this working in |
That would be great. I will provide you with my idf and objective function as well as the process, and we can go through it and run it with zeppy, which I strongly believe would help not only me but also other projects similar to mine. Right now I have completed my code that creates copies of the idf and feeds them to the multiprocess run. I have also finished the part that reads multiple csv files and extracts the objective variable that I want to use in the ga package. It seems that using the main() function as the objective function in ga is not possible. I will be working on this for the next couple of days and will then start implementing zeppy. I am leaving my code here in case anyone is curious how to implement multiprocess for one or multiple idfs. The main(X) function works fine. Unfortunately, when I use it as the objective function for ga (as follows) I get strange errors: Contents of EnergyPlus error file at C:\Users\eplusout.err """multiprocessing runs""" import os def make_options(idf): def main(X):
The ga code is like this: model=ga(function=main,dimension=1,variable_type='int',variable_boundaries=varbound, function_timeout=20000, |
Hi,
I have a problem running eppy with multiprocess.Pool in parallel. I have an objective function that I want to minimize, and I need to run an idf 100 times to find a mean that I will use in that objective function. When I tried multiprocess.Pool, the code gave me an error related to ep_version, and when I appended ep_version to my (idf, epw) list I got an error that a list object does not have a run function.
I went through the “def test_multiprocess_run(self):” function and replicated it with my idf, and it works properly. However, it didn’t produce a csv file in the output folder. It also creates a separate folder for each run, which I don’t need, since in each run I only want to read a variable and store it in a predefined list. Could you please help me figure out this problem? I have been looking for a solution for the past 5 days and have tried every potential solution I could find.
My objective function is:
from eppy.modeleditor import IDF

def f(X):
    w = X[0]
    fname1 = "/MyidfFile.idf"
    epwfile = "/MyepwFile.epw"
    idf = IDF(fname1, epwfile)
    # Here I have a few lines of code that make some changes to my idf
    # and save it as a new idf
    idf.saveas("/MyidfFile.idf")
    fname1 = "/MyidfFile.idf"
    epwfile = "/MyepwFile.epw"
    idf = IDF(fname1, epwfile)
I want to run the loop in parallel using multiprocess.Pool or any other parallel-run approach. So I separated the loop body into a new function and tried to call it from my objective function using Pool or parallel-run packages.
import pandas as pd

def EPRUN(I):
    idf = I
    idf.run(expandobjects=True, readvars=True)
    # read the csv produced by the run and sum the target column
    Data = pd.read_csv('eplusout.csv')
    MyTargetVar = Data['My Target Var Column Name'].sum()
    return MyTargetVar
On my first attempt, I created a list IDFS with 100 elements, all equal to my “idf = IDF(fname1,epwfile)”, and used the following package:
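import multiprocessing
from joblib import Parallel, delayed
num_cores = multiprocessing.cpu_count()
TRELC = Parallel(n_jobs=num_cores)(delayed(EPRUN)(idf) for idf in IDFS)  # pass each idf, not the whole IDFS list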
Error : ep_version should be define
So I reviewed Santoshphilip's post about the test runner and used the same approach to define my list of IDFs:
IDFS = []
ep_version = '-'.join(str(x) for x in modeleditor.IDF.idd_version[:3])
VERSION = '9-2-0'
assert ep_version == VERSION
for i in range(100):
    kwargs = {'output_directory': 'results_%s' % i,
              'ep_version': ep_version}
    IDFS.append([[fname1, epwfile], kwargs])
I tested this as a separate function without EPRUN, using multirunner (similar to the test_runner code), and it works, but it gave me 100 result folders, one per run, without the csv file that I need to extract my target variable. And when I used it to run my EPRUN function, it gave me an error that the “run” function is not defined for the object, which I assume is related to the new list “IDFS” that I defined.
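One possible reading of the missing csv files: the kwargs built in the loop above carry only output_directory and ep_version, while EPRUN passes readvars=True and expandobjects=True to idf.run(). The sketch below is an assumption rather than a confirmed fix. It builds one options dict per run that includes those two flags as well; `make_run_options` is a hypothetical helper name, and the keys are the same keywords already used in this post.

```python
def make_run_options(run_index, ep_version):
    """Build the keyword options for one run (hypothetical helper).

    Assumption: the csv is only produced when readvars is requested,
    as it is in the idf.run(...) call inside EPRUN.
    """
    return {
        "output_directory": "results_%s" % run_index,  # one folder per run
        "ep_version": ep_version,
        "readvars": True,
        "expandobjects": True,
    }


options = [make_run_options(i, "9-2-0") for i in range(100)]
```

Keeping a separate output_directory per run matters here, because every run would otherwise write to the same eplusout.csv.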
I went through documentation section 6.1, “Running in parallel processes”. It mentions parallel runs, and it seems there should be an example there that I couldn’t find.
“You first need to create your jobs as a list of lists in the form:
[[, ], ...]
The example here just creates 4 identical jobs apart from the output_directory the results are saved in, but you would obviously want to make each job different.
Then run the jobs on the required number of CPUs using runIDFs. . .”
I have tried to use “runIDFs” and it doesn’t work either.
I am pretty sure there is a trick here that I have missed and haven’t found so far. Could anyone please help me solve this issue? Any help would be greatly appreciated.
In summary, I need to run my EPRUN function 100 times with one idf, in parallel or with multiprocessing, and pass the results to my objective function for further analysis.