hysplit.Rmd

---
title: "hysplit batch processing"
author: "Julien Brun, brun@nceas.ucsb.edu"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

# Loading libraries
library(tidyverse)
library(lubridate)
library(foreach)

```

## Paths and files

Because we are processing a lot of large files, we will be relying on different file paths to access:

- your user's github repo
- the shared project directory on the server (Aurora)
- creating a new directory to store the necessary hysplit files

For the large .ARL files, we will be creating Symlinks (symbolic link), which will reference the original files without having to copy them over. We do this because hysplit requires the .ARLs forcing files to be in the same folder as the other files, and they would be time-consuming to copy and use a lot of storage.

```{r}
# creating file paths to different directories

# github repo directory
aurora_user <- "brun" # user folder name on Aurora (commonly your last name)
repo_dir <- getwd() # creates a file path to your github repo to reference below

# folder to save model files in your home directory
run_dir <- "~/run3" # the name of your folder, can change to whatever you prefer
dir.create(run_dir, showWarnings = FALSE) # creates the folder (will not copy over if you run again)

# shared project folder on aurora
shared_dir <- "/home/shares/snapp-wildfire/HYSPLIT_samplefiles" # same for all Aurora users

# WRF files
wrf_dir <-"/home/shares/snapp-wildfire/WRF"

# Copying ASCDATA.CFG file from the shared project directory
file.copy(file.path(shared_dir,"ASCDATA.CFG"), run_dir, overwrite = TRUE)

```
## Hysplit Loop

This loop runs in parallel using the foreach() function. For each month, it:
1) Copies over the EMITMES file
2) Creates the SETUP file
3) Creates the CONTROL file
4) Runs the hysplit model using the system() command

```{r}

# Start cluster to use for parallel computing
nb_cores <- 12  # Aurora has 96 cores
cl <- parallel::makeCluster(nb_cores)
doParallel::registerDoParallel(cl)

# list of ARL files that the loop will cycle through. 
# Must match the names of the .ARL files
# Will use in the created file names (Ex: cdumb.June, SETUP.Aug)

# Year of the run (can be changed in a loop if necessary)
run_year = "1981"

# Build the path for the specific year
arl_year_path <- file.path(wrf_dir, run_year)

# List all the ARL files
arl_files <- list.files(path = arl_year_path, pattern = "ARL")

# Create symlink for the ARL/forcing files
system(sprintf("ln -s %s/*.ARL %s", arl_year_path, file.path(run_dir)))

# List all the EMITFILES for this specific year
emmitimes_files <- list.files(path = shared_dir, pattern = sprintf("^Scenario.*(%s)", run_year))
emmitimes_files

# Main loop
foreach(scenarios_file = emmitimes_files) %dopar% {
  
  # build model run name with extension
  model_run_ext <- gsub("^[^_]*_", replacement = "", scenarios_file) 
  # build model run name without extension
  model_run <- gsub("\\.[^.]*$", replacement = "", model_run_ext) 

  # reading in the functions from the "hysplit_batch_functions" script
  source(file.path(repo_dir,"hysplit_batch_functions.R"))

  # copy the EMITIMES files
  emitimes_file <- file.path(shared_dir, paste0("Scenario_", tolower(model_run_ext)))
  message(emitimes_file)
  emitimes_run <- file.path(run_dir, paste0("EMITIMES", ".", model_run))
  message(emitimes_run)
  file.copy(emitimes_file, emitimes_run, overwrite = TRUE)

  # create the SETUP file
  create_setup <- create_setup(run_dir, extension = model_run, dir_templates = file.path(repo_dir, "file_templates"))

  # Get information from EMITIMES file that is needed in the CONTROL file
  control_info <- read_emitimes(emitimes_run)

  # Create the CONTROL file
  control_filename <- file.path(run_dir, "CONTROL")
  create_control(control_filename,
                 control_info$date,
                 control_info$locations,
                 control_info$runtime,
                 arl_file = arl_files, #paste0(model_run, ".ARL"),
                 arl_dir = "./",
                 extension = model_run,
                 dir_templates =  file.path(repo_dir, "file_templates"))


  #### Run the model #####
  
  repo <- getwd() # saves current working directory
  setwd(run_dir) # saves working directory to the run folder
  system(sprintf("../hysplit/5.0.0/exec/hycs_std %s", model_run)) # runs the command line function for hysplit
  #convert to text file here
  setwd(repo) # resets the working directory

}

# removing the symlink
system(sprintf("rm %s/*.ARL", run_dir))

# stop cluster to end parellelization
parallel::stopCluster(cl)

```