From 63a887d7e9a09baef8c9e675d83e6706013e0b74 Mon Sep 17 00:00:00 2001
From: amyheather <a.heather2@exeter.ac.uk>
Date: Tue, 11 Jun 2024 14:11:36 +0100
Subject: [PATCH] Updates to README, reproduction and report

---
 evaluation/reproduction_report.qmd |  12 +---
 reproduction/README.md             | 102 +++++++++++++++++++++--------
 reproduction/reproduction.ipynb    |  46 +++++++++++--
 3 files changed, 118 insertions(+), 42 deletions(-)
diff --git a/evaluation/reproduction_report.qmd b/evaluation/reproduction_report.qmd
index 9f20bc7..a9c7066 100644
--- a/evaluation/reproduction_report.qmd
+++ b/evaluation/reproduction_report.qmd
@@ -1,12 +1,6 @@
 ---
 title: "Summary report"
 subtitle: "For computational reproducibility assessment of Allen et al. 2020"
-authors:
-  - name: Amy Heather
-    url: https://github.com/amyheather
-    orcid: 0000-0002-6596-3479
-date: 06/07/2024
-date-format: "Do MMMM YYYY"
 format:
   html:
     page-layout: full
@@ -15,10 +9,10 @@ echo: False
 
 ## Study
 
-This was a discrete-event simulation modeling patient allocation to dialysis units during the pandemic. It required that COVID positive patients were kept seperate from COVID negative patients (with particular units designated for COVID positive cases). It modelled a worst-case scenario of COVID spread over 150 days.
-
 Allen, M., Bhanji, A., Willemsen, J., Dudfield, S., Logan, S., & Monks, T. A simulation modelling toolkit for organising outpatient dialysis services during the COVID-19 pandemic. *PLoS One* 15, 8 (2020). <https://doi.org/10.1371%2Fjournal.pone.0237628>.
 
+This is a discrete-event simulation modeling patient allocation to dialysis units during the pandemic. The patients need to be transported to a dialysis unit several times a week but, during the pandemic, it was required that COVID positive patients were kept separate from COVID negative patients. The proposed plan was that all COVID positive patients were sent to a particular unit (with a second overflow unit). This study tests that plan with a worst-case scenario of COVID spread over 150 days.
+
 ## Computational reproducibility
 
 Successfully reproduced **all 3 figures (100%)** within scope in **7h 6m (17.8%)**.
@@ -125,4 +119,4 @@ fig.for_each_trace(lambda t: t.update(name = newnames[t.name]))
 fig.show(config={'displayModeBar': False})
 ```
 
-<sup>The original study **repository** was evaluated against guidance for **sharing of artefacts** (best practice audit and STARS) and against criteria from journal **badges** relating to how open and reproducible the model is. The original study **article** and supplementary materials (excluding code) were evaluated against **reporting guidelines** for DES models: STRESS-DES, and guidelines adapted from ISPOR-SDM.</sup>
+<sup>Context: The original study repository was evaluated against guidance for sharing of artefacts (best practice audit and STARS) and against criteria from journal badges relating to how open and reproducible the model is. The original study article and supplementary materials (excluding code) were evaluated against reporting guidelines for DES models: STRESS-DES, and guidelines adapted from ISPOR-SDM.</sup>
diff --git a/reproduction/README.md b/reproduction/README.md
index 4b38f3c..a39f323 100644
--- a/reproduction/README.md
+++ b/reproduction/README.md
@@ -1,51 +1,95 @@
 # COVID-19 Dialysis Service Delivery Model
 
-**Article:** Allen, M., Bhanji, A., Willemsen, J., Dudfield, S., Logan, S., & Monks, T. **A simulation modelling toolkit for organising outpatient dialysis services during the COVID-19 pandemic**. *PLoS One* 15, 8 (2020). <https://doi.org/10.1371%2Fjournal.pone.0237628>.
+## Model summary
 
-**Source code:** <https://zenodo.org/records/3760626>.
+Allen, M., Bhanji, A., Willemsen, J., Dudfield, S., Logan, S., & Monks, T. A simulation modelling toolkit for organising outpatient dialysis services during the COVID-19 pandemic. *PLoS One* 15, 8 (2020). <https://doi.org/10.1371%2Fjournal.pone.0237628>.
 
-**Plain english summary:**
+This is a discrete-event simulation modeling patient allocation to dialysis units during the pandemic. The patients need to be transported to a dialysis unit several times a week but, during the pandemic, it was required that COVID positive patients were kept separate from COVID negative patients. The proposed plan was that all COVID positive patients were sent to a particular unit (with a second overflow unit). This study tests that plan with a worst-case scenario of COVID spread over 150 days.
 
-**Model scope:**
+Model patient pathway figure from the original study:
 
-**Reproduction specs:**
+![Patient pathway figure](../original_study/article_fig1.png)
 
-## Steps to reproduce results
+## Scope of the reproduction
 
-### Part 1. Set up environment
+In this assessment, we attempted to reproduce three figures.
 
-There are two options: using conda to install the provided virtual environment on your local machine, or using a docker image containing all the code and the virtual environment.
+![Figure 2. "Patient state over time by unit. The patient population progresses through infection over three months (with 80% infected). The bold line shows the median results of 30 trials, and the fainter lines show the minimum and maximum from the 30 trials."](../original_study/article_fig2.png){width=50%}
 
-**Conda environment:** To use the provided conda environment, run the following in a terminal
+![Figure 3. "Progression of patient population through COVID infection, assuming 80% become infected over three months, with 15% mortality. The figure also shows the number of patients not allocated to a dialysis session at any time. The bold line shows the median results of 30 trials, and the fainter lines show the minimum and maximum from the 30 trials."](../original_study/article_fig3.png){width=50%}
 
-```
-conda env create -f environment.yaml
-```
+![Figure 4. "Patient displacement. The number of patients displaced from their current unit (left panel) and the additional travel time to the unit of care (right panel) for displaced patients. These results do not include those receiving inpatient care. The patient population progresses through infection over three months (with 80% infected). The bold line shows the median results of 30 trials, and the fainter lines show the minimum and maximum from the 30 trials."](../original_study/article_fig4.png){width=50%}
 
-You can use this environment in your preferred IDE, such as VSCode. To use the browser-based JupyterLab, run the following:
+## Reproducing these results
 
-```
-conda activate covid19
-jupyter-lab
+### Repository overview
+
+```bash
+├── data
+│   └──  ...
+├── docker
+│   └──  ...
+├── output
+│   └──  ...
+├── sim
+│   └──  ...
+├── tests
+│   └──  ...
+├── environment.yaml
+├── README.md
+└── reproduction.ipynb
 ```
 
-**Docker:** To use the Docker image, you will need `docker` installed on your local machine. You can then obtain the image by either:
+* `data/` - Data input to the model.
+* `docker/` - Instructions for creation of Docker container.
+* `output/` - Output files from the model.
+* `sim/` - Model code.
+* `tests/` - Test to check that the model produces results consistent with our reproduction.
+* `environment.yaml` - Instructions for creation of Conda environment.
+* `README.md` - This file!
+* `reproduction.ipynb` - Notebook which runs the model and reproduces the items in scope.
 
-* Pulling a pre-built image from the GitHub container registry by running `docker pull ghcr.io/pythonhealthdatascience/covid19:latest`, or
-* Building the image locally from the Dockerfile by running `docker build --tag covid19 .`
+### Step 1. Set up environment
 
-To run the image, you should then issue the following commands in your terminal:
+You'll first want to create an environment with the specified version of Python and the required packages installed. There are a few options...
 
-```
-docker run -it -p 8080:80 --name covid19_docker covid19
-conda activate covid19
-jupyter-lab
-```
+Option A: **Conda environment**
+
+> Create the environment using this command in your terminal: `conda env create -f environment.yaml`
+> 
+> You can use this environment in your preferred IDE, such as VSCode. To use the browser-based JupyterLab, activate it using `conda activate covid19`, and then open JupyterLab by running `jupyter-lab`
+
+Option B: **Docker**
+
+> You'll need `docker` installed on your local machine. You can then obtain the image by either:
+>
+> * Pulling a pre-built image from the GitHub container registry by running `docker pull ghcr.io/pythonhealthdatascience/covid19:latest`, or
+> * Building the image locally from the Dockerfile by running `docker build --tag covid19 .`
+>
+> To run the image, you should then issue the following commands in your terminal:
+>
+> * `docker run -it -p 8080:80 --name covid19_docker covid19`
+> * `conda activate covid19`
+> * `jupyter-lab`
+>
+> Then open your browser and go to <https://localhost:8080>. This will open JupyterLab within the `reproduction/` directory.
+
+### Step 2. Running the model
+
+To run the model and produce all three items from the scope, execute the notebook `reproduction.ipynb`.
+
+To check that your results are consistent with our reproduction, run the command `pytest` in your terminal. When doing so, ensure your current directory is the `reproduction/` folder.
+
+## Reproduction specs and runtime
+
+This reproduction was conducted on an Intel Core i7-12700H with 32GB RAM running Ubuntu 22.04.4 Linux.
+
+Expected model runtime (given these specs) is **1m 3s** (as recorded within the notebook).
 
-Then open your browser and go to <https://localhost:8080>. This will open Jupyterlab within the reproduction/ directory.
+## Citation
 
-### Part 2. Running the model
+To cite the original study, please refer to the reference above. To cite this reproduction, please refer to the CITATION.cff file in the parent folder.
 
-The three items in the scope are all produced by running `reproduction.ipynb`.
+## License
 
-Run `pytest` (make sure you current directory is the `reproduction/` folder) to check that model results from your run are as expected.
+This repository is licensed under the MIT License.
\ No newline at end of file
diff --git a/reproduction/reproduction.ipynb b/reproduction/reproduction.ipynb
index 85e57aa..c15f5d5 100644
--- a/reproduction/reproduction.ipynb
+++ b/reproduction/reproduction.ipynb
@@ -4,7 +4,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Reproduction attempt\n",
+    "# Reproducing Allen et al. 2020\n",
+    "\n",
+    "This notebook runs the simulation model and reproduces the three figures from the original study.\n",
+    "\n",
+    "Please refer to the **README.md** file for further description of the model, set-up and reproduction."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Running the model\n",
     "\n",
     "Import required packages."
    ]
@@ -15,8 +26,13 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Packages for model\n",
     "import sim.sim_replicate as sim\n",
-    "from sim.parameters import Scenario"
+    "from sim.parameters import Scenario\n",
+    "\n",
+    "# Additional package to record runtime of this notebook\n",
+    "import time\n",
+    "start = time.time()"
    ]
   },
   {
@@ -49,7 +65,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Run simulation. In the order that these appear, these figures are the attempted reproductions of Figure 2, 3 and 4 respectively."
+    "Run simulation. In the order that these appear, these figures are the attempted reproductions of **Figure 2, 3 and 4 respectively** from the paper."
    ]
   },
   {
@@ -61,7 +77,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Running 30 reps of base_3_month => 0, 2, 4, 1, 5, 9, 8, 6, 12, 11, 13, 3, 7, 14, 18, 16, 17, 15, 19, 10, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, "
+      "Running 30 reps of base_3_month => 0, 1, 3, 12, 13, 6, 15, 4, 5, 11, 7, 8, 14, 16, 10, 9, 2, 18, 19, 17, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, "
      ]
     },
     {
@@ -107,6 +123,28 @@
     "sim.run_replications(\n",
     "    scenarios, number_of_replications, base_random_set)"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Notebook run time: 1m 3s\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Find run time in seconds\n",
+    "end = time.time()\n",
+    "runtime = round(end-start)\n",
+    "\n",
+    "# Display converted to minutes and seconds\n",
+    "print(f'Notebook run time: {runtime//60}m {runtime%60}s')"
+   ]
   }
  ],
  "metadata": {