Installation of neuron7.7 on PizDaint, Jureca and Galileo #525

Closed
3 tasks done
clupascu opened this issue Feb 13, 2020 · 31 comments
@clupascu
Collaborator

clupascu commented Feb 13, 2020

The Synaptic fitting use cases must be updated to Python3, so I need to know whether neuron7.7 (which works with Python3) can be installed on PizDaint, Jureca and Galileo. Please let me know how I can use the new modules. Thank you.

  • neuron 7.7 working on PizDaint
  • neuron 7.7 working on Jureca
  • neuron 7.7 working on Galileo

Related task: #533

@clupascu clupascu added this to the M24: March 2020 milestone Feb 13, 2020
@pramodk
Collaborator

pramodk commented Feb 18, 2020

The order of deployment is as follows. Make sure the existing module names remain intact (e.g. the ones that are already built with Python2):

  • Jureca
  • Galileo
  • Piz-Daint

@pramodk
Collaborator

pramodk commented Feb 25, 2020

@jorblancoa : this ticket requires python3 neuron module

@jorblancoa
Collaborator

Hi @clupascu

We have installed the new modules in Jureca and Piz-Daint.
Could you please test them and let us know if you encounter any problem?

Jureca

  • For Booster:
module --force purge all
module use /usr/local/software/jurecabooster/OtherStages
module load Architecture/KNL
module load Stages/2019a
module load Intel ParaStationMPI/5.2.2-1-mt imkl
module load HDF5 Boost Python/3.6.8

module use /p/project/cvsk25/software-deployment/HBP/jureca-booster/26-02-2020/modules/tcl/linux-centos7-haswell

module load  neuron/7.8.0b-serial-python3

  • For Cluster:
module --force purge all
module use /usr/local/software/jureca/OtherStages
module load Architecture/Haswell
module load Stages/2019a
module load Intel ParaStationMPI/5.2.2-1-mt imkl
module load HDF5 Boost Python/3.6.8

module use /p/project/cvsk25/software-deployment/HBP/jureca-cluster/26-02-2020/modules/tcl/linux-centos7-haswell

module load  neuron/7.8.0b-serial-python3

Piz-Daint

export MODULEPATH=/apps/hbp/ich002/hbp-spack-deployments/modules:$MODULEPATH
module use /apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/modules/tcl/cray-cnl7-haswell
module swap PrgEnv-cray PrgEnv-intel
module load daint-mc
module load cray-python/3.6.5.7

module load neuron/7.8.0b/intel-serial-python3

Thanks!

@alex4200
Contributor

alex4200 commented Mar 9, 2020

@clupascu Will test with Python3.

@clupascu
Collaborator Author

@jorblancoa What about Galileo? Is there already an installation of neuron7.7 on Galileo?

@jorblancoa
Collaborator

@clupascu
@pramodk is going to take care of Galileo. But before deploying everywhere, we want to be sure that the modules are working for you.
If you could test those in Daint or Jureca, that would be great!

Thanks!

@clupascu
Collaborator Author

@jorblancoa I tried on Jureca, but it seems that the order of the parameters is not working anymore.

I was using this before

srun ./x86_64/special 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

Any suggestion?

@jorblancoa
Collaborator

Could you post the error message or log file so I can have a look at the issue?

@clupascu
Collaborator Author

Traceback (most recent call last):
  File "fitting.py", line 251, in <module>
    fitting(sys.argv[1],sys.argv[2],sys.argv[3],sys.argv[4],sys.argv[5],sys.argv[6],sys.argv[7])
  File "fitting.py", line 53, in fitting
    singletrace_number = int(singletrace_number)
ValueError: invalid literal for int() with base 10: 'True'

This is the error message. It looks as if the argument 'True' is read in place of the argument 3.
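One way to narrow down a shifted argument list like this is to convert each value with an explicit check, so that a wrong position fails with a message showing the whole list. The following is only a minimal sketch: parse_flag and parse_fitting_args are hypothetical helpers, not part of fitting.py, and the layout (three file names, three True/False flags, one integer) is assumed from the command line quoted above.

```python
def parse_flag(text):
    """Convert the strings 'True'/'False' to bool and reject anything else."""
    if text not in ("True", "False"):
        raise ValueError("expected 'True' or 'False', got %r" % (text,))
    return text == "True"


def parse_fitting_args(args):
    """Validate seven values: three file names, three flags, one integer.

    The layout is an assumption based on the srun command in this thread.
    """
    if len(args) != 7:
        raise ValueError("expected 7 arguments, got %d: %r" % (len(args), args))
    config_file, exp_file, mod_file = args[0:3]
    flags = [parse_flag(value) for value in args[3:6]]
    try:
        number = int(args[6])
    except ValueError:
        # Show the full list so a shifted position is easy to spot.
        raise ValueError(
            "argument 7 should be an integer, got %r; full list: %r"
            % (args[6], args)
        )
    return config_file, exp_file, mod_file, flags, number
```

Printing sys.argv at the top of the script before calling such a helper shows exactly which values the interpreter hands over under each launcher.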

@clupascu
Collaborator Author

The folder I am working in on Jureca is /p/home/jusers/lupascu1/jureca/testneuron+python3/.
You can have a look there.

@pramodk
Collaborator

pramodk commented Mar 16, 2020

@clupascu : I am pretty sure the NEURON installation is working with Python3:

[kumbhar1@jrl05 ~]$ nrniv -python
NEURON -- VERSION 7.8.0-2-g92a208b+ HEAD (92a208b+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

loading membrane mechanisms from x86_64/.libs/libnrnmech.so
Additional mechanisms from files

>>>
>>> print "a"
  File "<stdin>", line 1
    print "a"
            ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("a")?
>>> print ("A")
A
>>>

We are not familiar with fitting.py, but I added a print after the conversion:

    singletrace_number = int(singletrace_number)
    print("----->", singletrace_number)

and tested it as follows:

$ srun -p develbooster -A vsk25 -n 1 nrniv 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py
srun: job 8109728 queued and waiting for resources
srun: job 8109728 has been allocated resources
NEURON -- VERSION 7.8.0-2-g92a208b+ HEAD (92a208b+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

Additional mechanisms from files
 netstims.mod ProbGABAAB_EMS_GEPH_g.mod
-----> 3

So I think the problem is somewhere in your environment or script.

@clupascu
Collaborator Author

I have made a little progress. As suggested, I replaced

srun ./x86_64/special 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

with

srun nrniv 'config-exp1.txt' 'exp1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 -mpi -python fitting.py

and now I get this warning in the output file

Warning: detected user attempt to enable MPI, but MPI support was disabled at build time.

and in the error file

srun: error: jrc6369: tasks 24-31,33-47: Segmentation fault
srun: error: jrc6369: task 32: Terminated
srun: error: jrc6368: tasks 0-3,6-7,9-10,12,18,20-21,23: Terminated
srun: error: jrc6368: tasks 4-5,8,11,13-17,19,22: Segmentation fault
srun: Force Terminated job step 8109797.0

@pramodk
Collaborator

pramodk commented Mar 16, 2020

@clupascu : I didn't mean to replace "srun ./x86_64/special" with "srun nrniv". I was just trying to run your example to see if NEURON works.

You still need to use "./x86_64/special" in order to get your local mod files included. nrniv just includes neuron's default mod files.

@clupascu
Collaborator Author

With "srun nrniv" I don't have the ValueError: invalid literal for int() with base 10: 'True' error anymore.

@clupascu
Collaborator Author

@pramodk have you got some time to look into this issue?

@clupascu
Collaborator Author

I have the same error also on PizDaint.

@pramodk
Collaborator

pramodk commented Mar 25, 2020

@clupascu : I haven't looked into this yet. Can you point me to the directory on Piz-Daint (in scratch or another non-home directory that I can access)?

@clupascu
Collaborator Author

@pramodk you can find the directory on Piz-Daint here /scratch/snx3000/bp000028/testneuron+python3

@pramodk
Collaborator

pramodk commented Mar 25, 2020

@clupascu : I was looking at this with @jorblancoa and we couldn't easily find the issue. This is what we did:

  • Let's start with a simple example of Python3 and neuron using pc.runworker:
from neuron import h
pc = h.ParallelContext()

def f(arg):
   id = int(pc.id())
   nhost = int(pc.nhost())
   print ('I am %d of %d'%(id, nhost))
   return arg*arg

pc.runworker()

s = 0
if pc.nhost() == 1:
   for i in range(1, 21):
      s += f(i)
else:
   for i in range(1, 21):
      pc.submit(f, i)
   while pc.working():
      s += pc.pyret()
print (s)

pc.done()
h.quit()

Running with NEURON and Python 3 just works fine:

module use /apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/modules/tcl/cray-cnl7-haswell
module load neuron/7.8.0b/intel-python3

bp000174@daint103:~/NRN-37> srun nrniv -python test_worker.py  -mpi
NEURON -- VERSION 7.8.0-2-g92a208b6+ HEAD (92a208b6+) 2019-10-29
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2018
See http://neuron.yale.edu/neuron/credits

2870.0
I am 2 of 8
I am 2 of 8
I am 2 of 8
I am 2 of 8
I am 3 of 8
I am 3 of 8
I am 3 of 8
I am 3 of 8
I am 4 of 8
I am 4 of 8
I am 4 of 8
I am 5 of 8
I am 5 of 8
I am 6 of 8
I am 7 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
I am 1 of 8
numprocs=8

So we believe that there is no issue with neuron and python3 installation itself.

  • It's true that while using your example we were getting errors like below with Python3 but not with Python2:
srun: error: nid00008: tasks 0,7: Segmentation fault (core dumped)

I added import fitness at the top of the fitting.py file:

import random
import csv
import math
....
import subprocess
import fitness

This lets the program run a bit further, but I still see a segfault later during the execution. Could you check by adding import fitness near the top?

I suspect the issue exists somewhere in the script and only becomes visible with Python3.

Do you have neuron installed on your desktop where you can test this? Otherwise, as I am not entirely familiar with the code, I think one needs to test parts of the code and see which function is causing the segfault.
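For locating which Python function is active when the segfault happens, the standard-library faulthandler module (Python 3.3+) can dump the Python traceback at the moment of the crash. A minimal sketch, assuming one log file per process; enable_crash_traces, the file naming and the rank argument are hypothetical and not part of fitting.py (in an MPI run the rank could come from int(pc.id())):

```python
import faulthandler
import os


def enable_crash_traces(log_dir, rank=0):
    """Write Python tracebacks to a per-rank file on SIGSEGV and similar signals.

    The returned file object must stay open for as long as the handler
    is enabled, so the caller should keep a reference to it.
    """
    path = os.path.join(log_dir, "crash_rank%d.log" % rank)
    log_file = open(path, "w")
    # all_threads=True dumps every Python thread, not only the crashing one.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this once near the top of the script, before the heavy imports, should be enough for the crash log to show the Python call stack at the point of failure.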

We can discuss this further tomorrow if required.

@clupascu
Collaborator Author

clupascu commented Mar 27, 2020

@pramodk Can you please let me know what was the parallel version to use instead of module load neuron/7.8.0b-serial-python3 on Jureca? It seems that if I use this parameter order

srun nrniv -python 'configA1.txt' 'expA1.txt' 'ProbGABAAB_EMS_GEPH_g.mod' False True False 3 fitting.py -mpi

Neuron is loaded (but is loaded of course several times).

@jorblancoa
Collaborator

Hi @clupascu
If you load the latest modules (25/03) you can find a parallel neuron.

module use /p/project/cvsk25/software-deployment/HBP/jureca-booster/25-03-2020/modules/tcl/linux-centos7-haswell
module load neuron/7.8.0b

Let me know if you have any problems.

@clupascu
Collaborator Author

@pramodk and @jorblancoa I modified my code (the segmentation fault was due to one function not available anymore in python3) and my code works perfectly now with the new modules from issue #533 on Jureca booster and PizDaint. Any news on the installation of the same modules on Galileo?

@jorblancoa
Collaborator

Hi @clupascu
Out of curiosity, what was the function not available in python3 causing the core dump?
Regarding the deployment in Galileo, we are finishing latest validations in Daint and Jureca, and once everything is properly tested, we will deploy in Galileo.

@clupascu
Collaborator Author

@jorblancoa the function not available in python3 causing the core dump was file.
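For anyone hitting the same problem: the Python 2 built-in file(...) no longer exists in Python 3, and open(...) is the replacement that works in both. The exact call in fitting.py is not shown in this thread, so the following is only an illustrative sketch with hypothetical helper names:

```python
def write_lines(path, lines):
    """Write each string on its own line; open() replaces Python 2's file()."""
    with open(path, "w") as handle:
        for line in lines:
            handle.write(line + "\n")


def read_lines(path):
    """Read the file back as a list of strings without trailing newlines."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle]
```

The with block closes the file even if an exception is raised, which also avoids leaving half-written output files behind when a run is terminated.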

@alex4200
Contributor

alex4200 commented Apr 6, 2020

Hi, can you please tick the boxes in the top-most main comment to mark the cases that are done?

@pramodk
Collaborator

pramodk commented Apr 7, 2020

As far as I know, all systems are up to date with neuron and python3. See #533.

@clupascu
Collaborator Author

clupascu commented Apr 7, 2020

I tested neuron7.7 on PizDaint, Jureca and Galileo and everything works perfectly. I am closing this issue.

@clupascu clupascu closed this as completed Apr 7, 2020
@mmigliore
Collaborator

mmigliore commented Apr 9, 2020 via email

@pramodk
Collaborator

pramodk commented Apr 9, 2020

@mmigliore : Has user bp000338 used the modules in the past? i.e. can he access /apps/hbp/ich002?

$  ls -l /apps/hbp/ich002/

Note that we can change permissions on /apps/hbp/ich002/hbp-spack-deployments but not on the top-level /apps/hbp/ich002/.
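If it helps, the affected user could also run a small check that walks the path from the root and reports the first directory that is missing or cannot be entered. This is just a sketch using os.access from the standard library; first_blocked_component is a hypothetical helper name:

```python
import os


def first_blocked_component(path):
    """Return the first directory on the way to `path` that is missing or
    not searchable by the current user, or None if the whole chain is fine."""
    current = os.path.abspath(path)
    chain = []
    # Collect the path and all of its parents up to the filesystem root.
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    # Check from the root downwards so the outermost problem is reported.
    for directory in reversed(chain):
        if not (os.path.isdir(directory) and os.access(directory, os.X_OK)):
            return directory
    return None


if __name__ == "__main__":
    print(first_blocked_component("/apps/hbp/ich002/hbp-spack-deployments"))
```

A result equal to /apps/hbp/ich002 would suggest the restriction sits at the top level, which is the part we cannot change ourselves.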

@mmigliore
Collaborator

mmigliore commented Apr 10, 2020 via email

@pramodk
Collaborator

pramodk commented Apr 11, 2020

i.e. can he access /apps/hbp/ich002?

No. This is going to be a general problem for many users not directly
related to HBP.

Ok, this is something new then.

  • Which directory is accessible to all HBP as well as non-HBP users? Is this something that needs to be checked with CSCS?
  • Do only the neuron and bluepyopt modules need to be available for non-HBP users, or also neurodamus-hippocampus?
  • Is this also the case on other systems (Juelich and Cineca)?
