From 65e60db0a55480e611f42b033289550668b34d6f Mon Sep 17 00:00:00 2001
From: mbianchetti <bianchetti.ml@gmail.com>
Date: Thu, 11 Mar 2021 19:03:59 -0300
Subject: [PATCH] Version that allows to pass arguments to the script; i.e.,
 the directory names, the seed, the max slot available per arc and the
 percentage of the slots that a demand can use. Reordering the output
 directories dividing by percentage first instead of topology

---
 README.md              |  57 ++++++++++++++---
 instances_generator.py | 137 +++++++++++++++++++++++++++++++----------
 requirements.txt       |   5 +-
 3 files changed, 153 insertions(+), 46 deletions(-)
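
The sizing rule described in the README changes below can be sketched as follows. This is only an illustration under stated assumptions: the helper names `density`, `max_number_of_demands` and `instance_size` are hypothetical, the formula is the one quoted in the README, and truncating the bound to an integer is a guess, since the body of `calculateMaxNumberOfDemands` is not shown in this patch.

```python
import math
import random


def density(n, m):
    """Density of an undirected graph with n nodes and m edges."""
    return 2.0 * m / (n * (n - 1))


def max_number_of_demands(n, m, S, max_sd):
    """Upper limit from the README formula.

    Assumption: the result is truncated to an integer.
    """
    d = density(n, m)
    return int((n - 1) * d * S / (max_sd / 2))


def instance_size(n, m, S, p, rng):
    """Return (max slots per demand, number of demands) for one instance.

    p is the maximum fraction of the S slots that a demand may use.
    """
    max_sd = math.ceil(p * S)
    upper = max_number_of_demands(n, m, S, max_sd)
    # The number of demands lies between half the limit and the limit.
    return max_sd, rng.randint(upper // 2, upper)


if __name__ == "__main__":
    rng = random.Random(1988)  # 1988 is the default seed of the script
    print(instance_size(5, 10, 10, 0.5, rng))
```

For a complete graph on 5 nodes with S = 10 and p = 0.5, the limit is 16, so the number of demands falls between 8 and 16.
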
diff --git a/README.md b/README.md
index a360fda..1081db5 100644
--- a/README.md
+++ b/README.md
@@ -19,20 +19,20 @@ From the literature we obtained the most used data for the RSA problem:
 
   - The bandwidth of the optical fiber used on average is 4800 GHz, although the theoretical maximum bandwidth of the optical fiber is around 231 THz.
 
-We use that information to define the available number of slots per link and the maximal amount of slots used by each demand. This is stored in the variables:
+We use that information to define the default number of available slots per link, the maximum number of slots used by each demand, and the number of demands.
+
+To define the number of demands of each instance, we estimate an upper limit from the density of the graph, the number of slots per link, and the average number of slots per demand, aiming to make most instances feasible, but not all. The number of demands is then drawn at random between half of that upper limit and the limit itself.
 
 ```
-avaliable_S   &   max_slots_by_demand
+max_number_of_demands = (n - 1) * d * S / (max_slots_per_demand / 2)
 ```
-    
-To define the number of demands of each instance, we estimate an upper limit using the density of the graph, the number of slots per link and the average of slots per demand trying to make the majority of instances feasible, but not all.
 
 ## Files
 The main scrip called **instances_generator.py** reads the topologies stored in the __topologies/__ directory and generates a serie of instances files for the RSA for each topology based on the data of the literature. It depends on the modulation level the optical fiber used, the graph density among other paramters.
 
 ## Data Format
 The format of the data is commented in the header of each file. The separator used is the tabulation, and the line starting with # is a comment.
-
+### Topology
 The topology file is preceded by the number of nodes and the number of edges followed by the list of these one by line.
 ```
 # Comment
@@ -41,14 +41,15 @@ The topology file is preceded by the number of nodes and the number of edges fol
 <node k>    <node l>
 ...
 ```
-Most of the instances belong to capacitated networks and we got the data. In those cases the weight of each edge is added when is defined. 
+Most of the instances belong to capacitated networks and we were able to obtain that data. In those cases, the weight of each edge is added when it is defined.
 ```
 <node i>    <node j>    <weight ij>
 ```
 
+### Generated Instance
 As well as the problem is stated over directed graphs and due to the way in which the networks are made we asume all the links have both senses.
 
-The instance file also starts with a header explaining the format briefly. Then it has the amount of slots available for each edge and the number of the requested demands.
+The instance file also begins with a header that briefly explains the format and shows the version of this software and the seed used. Next come the number of slots available for each edge and the number of requested demands, followed by the list of demands.
 ```
 # Comment
 S     |D|
@@ -58,11 +59,49 @@ S     |D|
 ```
 
 ## Usage
-  
+### Requirements
+The requirements are stored in `requirements.txt`. They can be installed via
+
+```
+pip install -r requirements.txt
+```
+
+### Run
 To generate the instances just run the script:
 
 ```
 python instances_generator.py
 ```
 
-The instances are going to be placed on a new folder called instances into the main directory. Each one of those files with its asociated topology are the input for the RSA problem.
\ No newline at end of file
+By default, the instances are placed in a new folder called `instances` in the directory of the script, inside subfolders named after the maximum percentage of slots used by demands and then the topology. Each of those files, together with its associated topology, is an input for the RSA problem.
+
+Run the following to see how to configure the parameters:
+
+```
+python instances_generator.py -h
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -mdir MDIR            The main directory or path. If no tdir or idir parameters are used, mdir must contain the 'topologies' and/or 'instances' folder. The default
+                        value is the location of this script
+  -tdir TDIR            The topologies directory or path.
+  -idir IDIR            The directory or path for the created instances.
+  -s SEED, --seed SEED  The random seed. Default is 1988.
+  -S SLOTS [SLOTS ...], --slots SLOTS [SLOTS ...]
+                        List of amounts of available slots.
+  -p PERCENTS [PERCENTS ...], --percents PERCENTS [PERCENTS ...]
+                        List of maximum percentage of total available slots that a demand can use. Must be in (0, 1].
+```
+
+### Example
+The following line will generate instances for S in [10, 25] and p in [10%, 20%, 30%, 50%]. They will be saved in `./instances/`.
+
+```
+python instances_generator.py -S 10 25 -p .1 .3 .5 .2
+```
+
+This one will generate instances with the default values, S in [10, 15, 20, 30, 40, 60, 80, 100, 150, 200, 300, 400, 600, 800, 1000] and p in [10%, 20%, 30%, ..., 80%], in the directory `/home/instances`.
+
+```
+python instances_generator.py -idir /home/instances
+```
\ No newline at end of file
diff --git a/instances_generator.py b/instances_generator.py
index c4fe2d3..fa568f5 100644
--- a/instances_generator.py
+++ b/instances_generator.py
@@ -2,7 +2,7 @@
 
 '''
     This script reads the topologies stored in the topologies directory
-    and generates a serie of instances files for the RSA for each topology 
+    and generates a serie of instances files for the RSA for each topology
     based on the data of the bibliography. It depends on the modulation level
     the optical fiber used, the graph density among other paramters.
 
@@ -10,7 +10,7 @@
 
     + The slot bandwith most of the cases is 12.5 GHz. However it could be
     smaller (sometimes 5 GHz) or larger, but not much larger, due to the fact
-    that for the RWA problem with WDM, the minimum bandwidth of the slot is 
+    that for the RWA problem with WDM, the minimum bandwidth of the slot is
     50 GHz and the main objective of RSA is to improve granularity.
 
     + The bandwidth of the optical fiber used on average is 4800 GHz,
@@ -20,10 +20,13 @@
 
 import os
 import random
+import argparse
+import numpy as np
+import math
 
 __author__ = "Marcelo Bianchetti"
 __credits__ = ["Marcelo Bianchetti"]
-__version__ = "1.0.0"
+__version__ = "1.2.0"
 __maintainer__ = "Marcelo Bianchetti"
 __email__ = "mbianchetti at dc.uba.ar"
 __status__ = "Production"
@@ -32,6 +35,11 @@
 sep = '\t'
 
 
+def error(err):
+    print("ERROR: {}".format(err))
+    raise SystemExit(1)
+
+
 def calculateGraphDensity(n, m):
     ''' Returns the density of the undirected graph '''
     return 2.*m/(n*(n-1))
@@ -39,15 +47,15 @@ def calculateGraphDensity(n, m):
 
 def calculateMaxNumberOfDemands(n, m, S, max_sd):
     ''' Given a graph and the amount of slots per link
-        it returns an estimative of the max amount of 
-        demands per instance. 
+        it returns an estimative of the max amount of
+        demands per instance.
 
         n: number of nodes
         m: number of undirected edges
         S: chosen number of slots per arc
         max_sd: chosen max value for slots by demand
 
-        A tighter bound could be the min grade of the 
+        A tighter bound could be the min grade of the
         nodes but we want infeasible instances too.
     '''
     d = calculateGraphDensity(n, m)
@@ -58,61 +66,124 @@ def calculateMaxNumberOfDemands(n, m, S, max_sd):
 def readTopologyData(tops_dir, top_fname):
     ''' Returns the amount of nodes and edges of the graph '''
     with open(os.path.join(tops_dir, top_fname)) as f:
-        for l in f:
-            if l.startswith('#'):
+        for line in f:
+            if line.startswith('#'):
                 continue
-            l = l.split()
-            return int(l[0]), int(l[1])
+            line = line.split()
+            return int(line[0]), int(line[1])
 
 
 if __name__ == "__main__":
-    main_dir = os.path.dirname(os.path.abspath(__file__))
-    topologies_dir = os.path.join(main_dir, 'topologies')
-    instances_dir = os.path.join(main_dir, 'instances')
+
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("-mdir",
+                        type=str,
+                        help="The main directory or path. If no tdir or idir "
+                             "parameters are used, mdir must contain the "
+                             "'topologies' and/or 'instances' folder. The "
+                             "default value is the location of this script")
+    parser.add_argument("-tdir",
+                        type=str,
+                        help="The topologies directory or path.")
+    parser.add_argument("-idir",
+                        type=str,
+                        help="The directory or path for the created "
+                        "instances.")
+    parser.add_argument("-s", "--seed", type=int, default=1988,
+                        help="The random seed. Default is 1988.")
+    parser.add_argument("-S", "--slots", nargs='+', type=int,
+                        help="List of amounts of available slots.")
+    parser.add_argument("-p", "--percents", nargs='+', type=float,
+                        help="List of maximum percentage of total available "
+                             "slots that a demand can use. Must be in (0, 1].")
+    args = parser.parse_args()
+
+    main_dir = (os.path.dirname(os.path.abspath(__file__))
+                if args.mdir is None else args.mdir)
+
+    topologies_dir = (os.path.join(main_dir, 'topologies')
+                      if args.tdir is None
+                      else os.path.abspath(args.tdir))
+
+    instances_dir = (os.path.join(main_dir, 'instances')
+                     if args.idir is None
+                     else os.path.abspath(args.idir))
+
+    for d in [main_dir, topologies_dir]:
+        if not os.path.exists(d):
+            error("Directory '{}' not found.".format(d))
+
     instance_fname = 'instance_{}_{}_{}_{}.txt'
 
-    avaliable_S = [10, 15, 20, 30, 40, 60, 80,
-                   100, 150, 200, 300, 400, 600, 800, 1000]
+    random.seed(args.seed)
+
+    # Available slots per fiber
+
+    avaliable_S = ([10, 15, 20, 30, 40, 60, 80, 100, 150, 200, 300, 400,
+                    600, 800, 1000] if args.slots is None else
+                   [s for s in set(args.slots) if s > 0])
 
-    # From a low loaded network to a really high loaded one.
-    max_slots_by_demand = [4, 6, 8, 10, 15, 20, 25, 30, 50, 80]
+    # Default: From a lightly loaded network to a heavily loaded one.
+    max_percentages_of_slots_by_demand = (np.arange(.1, .9, .1)
+                                          if args.percents is None else
+                                          [p for p in set(args.percents)
+                                           if p > 0 and p <= 1])
 
+    # Creation of instances directory if it does not exist
     if not os.path.exists(instances_dir):
         os.makedirs(instances_dir)
 
-    random.seed(1988)
-    for top_fname in os.listdir(topologies_dir):
-        top_name = os.path.splitext(top_fname)[0]
-        n, m = readTopologyData(topologies_dir, top_fname)
-        instance_dir = os.path.join(instances_dir, top_name)
+    for percentage in max_percentages_of_slots_by_demand:
 
-        if not os.path.exists(instance_dir):
-            os.makedirs(instance_dir)
+        # The resulting instances are created in directories
+        # according to their percentage and topology
+        percentage_dir = os.path.join(instances_dir,
+                                      str(int(round(percentage * 100))))
 
-        for S in avaliable_S:
-            for max_sd in max_slots_by_demand:
+        # Creation of the percentage directory if it does not exist
+        if not os.path.exists(percentage_dir):
+            os.makedirs(percentage_dir)
 
-                if max_sd > S:
-                    break
+        for top_fname in os.listdir(topologies_dir):
+            top_name = os.path.splitext(top_fname)[0]
+            n, m = readTopologyData(topologies_dir, top_fname)
+
+            # The resulting instances are created in directories
+            # according to their topologies
+            top_dir = os.path.join(percentage_dir, top_name)
+
+            # Creation of the topology directory if it does not exist
+            if not os.path.exists(top_dir):
+                os.makedirs(top_dir)
+
+            # Iterates over each available S
+            for S in avaliable_S:
+                max_sd = math.ceil(percentage * S)
 
                 max_nD = calculateMaxNumberOfDemands(n, m, S, max_sd)
                 nD = random.randint(int(max_nD/2), max_nD)
 
                 demand_f = os.path.join(
-                    instance_dir, instance_fname.format(top_name, S, max_sd, nD))
+                    top_dir, instance_fname.format(
+                        top_name, S, max_sd, nD))
 
                 with open(demand_f, 'w') as out:
                     out.write('# Created by {}\n'.format(__author__))
+                    out.write('# Version: {}\n'.format(__version__))
+                    out.write('# Seed: {}\n'.format(args.seed))
                     out.write('# Format:\n')
                     out.write('#   First line: S  |D|\n')
                     out.write('#   Other lines: <src\tdst\t#slots>\n')
 
-                    l = '{}{}{}'.format(S, sep, nD)
-                    out.write(line_enter.format(l))
+                    line = '{}{}{}'.format(S, sep, nD)
+                    out.write(line_enter.format(line))
 
                     for _ in range(nD):
                         src, dst = random.sample(range(n), 2)
                         s = random.randint(1, max_sd)
-                        l = '{src}{sep}{dst}{sep}{s}'.format(
+                        line = '{src}{sep}{dst}{sep}{s}'.format(
                             S=S, sep=sep, src=src, dst=dst, s=s)
-                        out.write(line_enter.format(l))
+                        out.write(line_enter.format(line))
+
+
+# git commit -m "Version that allows to pass the directory names, the seed, the max slot available per arc and the percentage of the slots that a demand can use as arguments. Reordering the output directories. "
\ No newline at end of file
diff --git a/requirements.txt b/requirements.txt
index 3e460aa..a39ae54 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1 @@
-matplotlib==3.1.1
-networkx==2.3
-numpy==1.17.2
-scipy==1.3.1
\ No newline at end of file
+numpy==1.20.1