Running on Solstorm (cluster)

Setup

Log in to solstorm-login.iot.ntnu.no. The following needs, in principle, to be done only once:

  1. Put the following in your .bashrc on solstorm:

module load Python/3.10.4-GCCcore-11.3.0 (or another Python 3 version)

  2. Install the necessary Python packages on solstorm:

pip3 install shapely numpy progress matplotlib tensorflow geopy pandas scikit-learn gurobipy jsonpickle bigquery

  3. Clone the git repository on solstorm and place the fomo directory in /storage/users/$USER/

Scripts

The cluster support is built with four scripts:

  • create_runs.py: Creates the jobs to be distributed to the cluster. Modify and run this script to generate the cluster jobs for your experiment (a sketch of what such a script can look like follows this list).
  • run_all.sh: Distributes all jobs created with create_runs.py and runs them on the individual compute nodes.
  • run.py: This script runs on each individual compute node and is executed automatically by run_all.sh. It stores data from the run in a file called output.csv. Modify this script to choose which data to store (see the second sketch below).
  • visualise_runs.py: Example of how to visualise the results from the cluster run (see the sketch at the end of the Usage section).
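
What create_runs.py has to generate is experiment specific, but a job-generation script of this kind can be sketched roughly as below. Everything in the sketch (the parameter names, the jobs/ directory, the plain-JSON job files) is an illustrative assumption, not the actual fomo code:

```python
# Hypothetical sketch of a job-generation script in the spirit of create_runs.py.
# Parameter names and the jobs/ layout are illustrative assumptions only.
import itertools
import json
import os

# Example parameter sweep; replace with the requirements of your experiment.
policies = ["greedy", "random"]
num_vehicles = [5, 10, 20]
seeds = range(3)

os.makedirs("jobs", exist_ok=True)

for job_id, (policy, vehicles, seed) in enumerate(
        itertools.product(policies, num_vehicles, seeds)):
    job = {
        "job_id": job_id,
        "policy": policy,
        "num_vehicles": vehicles,
        "seed": seed,
    }
    # One small description file per job; these are what get handed out to the nodes.
    with open(os.path.join("jobs", f"job_{job_id}.json"), "w") as f:
        json.dump(job, f, indent=2)

print(f"Created {job_id + 1} job files in jobs/")
```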

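run.py is the script you modify to decide what ends up in output.csv. A minimal sketch of that pattern, assuming each node is handed one of the hypothetical job files from the sketch above and that the experiment itself lives in a run_experiment() function of your own, could look like this:

```python
# Hypothetical sketch of a per-node runner in the spirit of run.py.
# The job-file format and the result fields are illustrative assumptions only.
import csv
import json
import os
import sys

def run_experiment(job):
    # Placeholder for the actual experiment; returns the values to store.
    return {"job_id": job["job_id"], "policy": job["policy"], "score": 0.0}

def main():
    # Path to a job file, e.g. jobs/job_0.json
    with open(sys.argv[1]) as f:
        job = json.load(f)

    result = run_experiment(job)

    # Append one row per job to output.csv; write the header only once.
    write_header = not os.path.exists("output.csv")
    with open("output.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(result.keys()))
        if write_header:
            writer.writeheader()
        writer.writerow(result)

if __name__ == "__main__":
    main()
```
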
Usage

  1. Modify create_runs.py according to the requirements of your experiment
  2. On solstorm-login.iot.ntnu.no:
    1. Prework:
      • $ cd /storage/users/$USER/fomo
      • Make sure git is initialized and the repository is up to date, e.g.: $ git init and $ git pull
      • Optionally start a screen session so the run survives a lost connection: $ screen
    2. Create the jobs like this: $ python3 create_runs.py
    3. Distribute and run all jobs like this: $ ./run_all.sh -r <row> -n <nodenum1,nodenum2,...,nodenumN> (for example: $ ./run_all.sh -r 3 -n 1,2,3,4)
    4. The results are stored in output.csv
  3. Optionally visualise the results on your PC (with output.csv copied from the cluster): $ python3 visualise_runs.py
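
What visualise_runs.py plots depends on what run.py stored. A minimal sketch, assuming the hypothetical policy and score columns from the run.py sketch above, could be:

```python
# Hypothetical sketch of a result plot in the spirit of visualise_runs.py.
# Column names match the illustrative run.py sketch above, not the real output.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("output.csv")

# Average score per policy across all jobs (example aggregation).
summary = df.groupby("policy")["score"].mean()
summary.plot(kind="bar")
plt.ylabel("mean score")
plt.title("Cluster run results")
plt.tight_layout()
plt.savefig("results.png")
plt.show()
```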