A step-by-step guide to running MCSDetect on hundreds of cells at the same time.
NOTE The steps below are very detailed, but in practice they take ~5 minutes or less to execute (excluding actual compute time).
- 0 Requirements
- 1 Login to cluster
- 2 Validate dataset
- 3 Schedule dataset
- 4 PostProcess results
- 5 Export output
- 6 Troubleshooting
- Access to the cluster
- SSH setup
- Data is stored on the cluster in the following format (do NOT use spaces in naming, please)
- Topdirectory
  - Replicate number (directory named 1, 2, ...)
    - Treatment (directory named e.g. "GP78+", "HT-1080", ...)
      - SeriesXYZ (directory that holds the image data for 1 segmented, deconvolved cell, where XYZ is an integer, e.g. "Series123")
        - channel01.tif (3D 16-bit image file containing the mitochondria channel, name ending with "1.tif")
        - channel02.tif (ER channel, name ending with "2.tif")
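Concretely, a minimal (hypothetical) dataset with one replicate, one treatment, and two cells would look like this:
Topdirectory/
├── 1/
│   └── GP78+/
│       ├── Series001/
│       │   ├── channel01.tif
│       │   └── channel02.tif
│       └── Series002/
│           ├── channel01.tif
│           └── channel02.tif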
Scheduling large (Terabytes) data is time consuming, and you do not want to waste your own time finding out some of your data isn't properly organized. We will validate your dataset to ensure no errors lead to wasted time.
In this guide we will refer to your username on the cluster as `$USER`.
In the Linux command line, $VAR references a variable, e.g. $HOME is your home directory, typically /home/$USER.
Log in to the cluster:
ssh $USER@cedar.computecanada.ca
This should result in something like:
[bcardoen@cedar1 ~]$
(bcardoen is my user name; '~' means you're in your home directory)
You can check your user name and home directory:
echo $USER
echo $HOME
This will print something like
[bcardoen@cedar1 ~]$ echo $USER
bcardoen
[bcardoen@cedar1 ~]$ echo $HOME
/home/bcardoen
[bcardoen@cedar1 ~]$
We will instruct you to set variables; this is easily done:
export MYVAR="somevalue"
Let's test this; it should show:
[bcardoen@cedar1 ~]$ export MYVAR="somevalue"
[bcardoen@cedar1 ~]$ echo $MYVAR
somevalue
Great, now let's move on.
OPTIONAL (but HIGHLY recommended)
Once login is successful, we will start a tmux session to ensure any network interruptions do not break your workflow:
tmux
To detach from a running session, press Ctrl-b followed by d; the session keeps running, so you can safely log out.
If you want to reconnect to an existing session:
tmux attach -t 0 # reconnect to session 0
If you want to view which sessions are active:
tmux ls
Note that there are multiple login servers; if you can't find your session, ensure you log in to the right one (cedar1 vs cedar5):
ssh $USER@cedar5.computecanada.ca
Create an experiment directory on scratch; it will hold intermediate files needed during processing:
export EXPERIMENT="/scratch/$USER/myexperiment"
mkdir -p $EXPERIMENT
cd $EXPERIMENT
For the next step we'll need to download DataCurator to validate your dataset layout. You can build it from the DataCurator repository, but an optimized Singularity image is ready for download:
module load singularity
singularity pull --arch amd64 library://bcvcsert/datacurator/datacurator:latest
chmod u+x datacurator_latest.sif
You'll need your group id, which is of the form rrg-yourpi or def-yourpi:
export MYGROUP="rrg-mypi" # Replace this with something valid for you
salloc --mem=62G --account=$MYGROUP --cpus-per-task=16 --time=3:00:00
This will log you in to a compute node with 16 cores and 62GB of RAM for 3 hours.
You should see something like this:
[bcardoen@cedar5 myexperiment]$ salloc --mem=62G --account=$MYGROUP --cpus-per-task=16 --time=3:00:00
salloc: Pending job allocation 60557708
salloc: job 60557708 queued and waiting for resources
salloc: job 60557708 has been allocated resources
salloc: Granted job allocation 60557708
salloc: Waiting for resource configuration
salloc: Nodes cdr568 are ready for job
[bcardoen@cdr568 myexperiment]$
Note how the prompt changed from bcardoen@cedar5 to bcardoen@cdr568: you are no longer on a login node, but on a compute node, where you have the resources (62GB RAM and 16 cores) to use.
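You can confirm the allocation from inside the job; SLURM sets these environment variables automatically:
echo $SLURM_CPUS_PER_TASK # should print 16
echo $SLURM_MEM_PER_NODE # allocated memory per node, in MB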
DataCurator needs a recipe to verify, this recipe can be found online. Let's download it to our experiment directory
wget https://raw.githubusercontent.com/bencardoen/SubPrecisionContactDetection.jl/main/recipe.toml
This will show something like:
[bcardoen@cdr568 myexperiment]$ wget https://raw.githubusercontent.com/bencardoen/SubPrecisionContactDetection.jl/main/recipe.toml
--2023-02-21 06:32:40-- https://raw.githubusercontent.com/bencardoen/SubPrecisionContactDetection.jl/main/recipe.toml
Resolving raw.githubusercontent.com... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1308 (1.3K) [text/plain]
Saving to: ‘recipe.toml’
recipe.toml 100%[=========================================================================================================================================>] 1.28K --.-KB/s in 0s
2023-02-21 06:32:40 (63.9 MB/s) - ‘recipe.toml’ saved [1308/1308]
Let's check what we have in our experiment directory:
ls -t . # List all files in our current (.) directory
Should show:
[bcardoen@cdr568 myexperiment]$ ls -t .
recipe.toml datacurator_latest.sif
We need to update this template with the data locations. Let us assume your input data and desired output location are:
export DATA="/project/myresearchgroup/mydata"
export OUTPUT="/project/myresearchgroup/myoutput"
You can either do this with an editor, or with these commands:
sed -i "s|INPUT|${DATA}|" recipe.toml # Replace the string 'INPUT' with the correct data directory
sed -i "s|OUTPUT|${OUTPUT}|" recipe.toml # Replace the string 'OUTPUT' with the output directory
If you now check the recipe, these two lines should have changed to point to your data:
inputdirectory = "INPUT"
# will be
inputdirectory = "/project/myresearchgroup/mydata"
# and
{name="out", aggregator=[[["change_path", "OUTPUT"],"filepath","sort","unique","shared_list_to_file"]]},
# will be
{name="out", aggregator=[[["change_path", "/project/myresearchgroup/myoutput"],"filepath","sort","unique","shared_list_to_file"]]},
NOTE If your channels are 0.tif and 1.tif, rather than 1.tif and 2.tif, please edit the template to reflect this:
sed -i "s|1,2|0,1|" recipe.toml ## Optional if you need to change channels
module load singularity
export SINGULARITY_BINDPATH="/scratch/$USER,$SLURM_TMPDIR" # Make sure DC can access the data; if your data lives under /project, add that path here too
export JULIA_NUM_THREADS=$SLURM_CPUS_PER_TASK # Tell DC to use all 16 cores
./datacurator_latest.sif -r recipe.toml # Execute the recipe
This will do the following:
- Build lists for batch processing of all valid files in in.txt and out.txt
- Report any data that doesn't match the recipe in errors.txt
- Compute intensity statistics of all valid data in channels.csv
- Compute object statistics of all valid data in objects.csv
Wait for it to complete (depending on your data size, 1-20 minutes).
When it finishes, the result will look something like this:
[ Info: 2023-02-21 07:11:27 DataCurator.jl:2271: Writing to channels.csv
[ Info: 2023-02-21 07:11:27 DataCurator.jl:2101: Finished processing dataset located at /home/bcardoen/scratch/MERCS/MS_01_19_2023_3D_STED_SYNJ_FLAG_568_MITO_532_ER_488
[ Info: 2023-02-21 07:11:27 DataCurator.jl:2105: Dataset processing completed without early exit
[ Info: 2023-02-21 07:11:27 curator.jl:168: Writing counters to counters.csv
[ Info: 2023-02-21 07:11:27 curator.jl:180: Complete with exit status proceed
[bcardoen@cdr568 myexperiment]$
Let's review what DataCurator computed for us:
ls -t .
Should give
[bcardoen@cdr568 myexperiment]$ ls -t .
channels.csv counters.csv in.txt objects.csv out.txt errors.txt recipe.toml datacurator_latest.sif
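If errors.txt is non-empty, inspect it before scheduling anything; it is assumed here that each entry refers to a file or directory that did not match the recipe:
head errors.txt # show the first offending entries, if any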
In your current directory you should now have 2 files:
- in.txt
- out.txt
We will ask the cluster to process all cells listed in in.txt, with output to be stored in the locations listed in out.txt.
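You can peek at the first entries to make sure they point where you expect (the paths in the comments below are hypothetical):
head -n 1 in.txt # e.g. /project/myresearchgroup/mydata/1/GP78+/Series001
head -n 1 out.txt # e.g. /project/myresearchgroup/myoutput/1/GP78+/Series001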
Let's get a prepared script to do just that
wget https://raw.githubusercontent.com/bencardoen/SubPrecisionContactDetection.jl/main/hpcscripts/arraysbatch.sh
Make it executable
chmod u+x arraysbatch.sh
You need to change this script in 3 places to match your setup:
- ACCOUNT
- EMAIL
- Nr of cells
export MYEMAIL="[email protected]"
export MYGROUP="rrg-mypi"
NCELLS=`wc -l in.txt | awk '{print $1}'`
sed -i "s|CELLS|${NCELLS}|" arraysbatch.sh
sed -i "s|EMAIL|${MYEMAIL}|" arraysbatch.sh
sed -i "s|ACCOUNT|${MYGROUP}|" arraysbatch.sh
sed -i "s|1,2|0,1|" arraysbatch.sh ## Optional if you need to change channels
Next, we need to make sure the MCSDetect Singularity image is in place:
singularity pull --arch amd64 mcsdetect.sif library://bcvcsert/subprecisioncontactdetection/mcsdetect_f35_j1.7:j1.8
This should download the file mcsdetect.sif into your current directory.
ls -t .
shows
mcsdetect.sif channels.csv counters.csv in.txt objects.csv out.txt errors.txt recipe.toml datacurator_latest.sif
Set it executable
chmod u+x mcsdetect.sif
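You can optionally sanity-check the download; singularity inspect prints the image metadata:
singularity inspect mcsdetect.sif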
Now it's time to submit the job
sbatch arraysbatch.sh
This will be the result
[bcardoen@cdr568 myexperiment]$ sbatch arraysbatch.sh
Submitted batch job 60568508
You will get email updates on job progression, and .out files containing the logs of all jobs will be saved in the current directory.
For example, for this job the email had the following subject:
Slurm Array Summary Job_id=60568508_* (60568508) Name=arraysbatch.sh Began
Output will be saved in directory $OUTPUT.
squeue -u $USER
Will show you the status of your current running jobs, for example:
JOBID USER ACCOUNT NAME ST TIME_LEFT NODES CPUS TRES_PER_N MIN_MEM NODELIST (REASON)
60566608 bcardoen rrg-hamarneh interactive R 2:39:41 1 16 N/A 62G cdr568 (None)
60568508_[2-58] bcardoen rrg-hamarneh arraysbatch.sh PD 18:00:00 1 6 N/A 116G (Priority)
The first line shows the interactive job we're using for validation and the current session; the second line shows the 58 cells being processed.
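Once an array task has finished, you can check how efficiently it used its allocation; seff is available on many SLURM clusters, including Cedar:
seff 60568508_5 # replace with your own JOBID_CELLID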
After some time you'll see the output and logs of the individual jobs appear in your experiment directory. You usually don't need them, but if something goes wrong they can be invaluable.
ls -t .
shows
[bcardoen@cdr568 myexperiment]$ ls
arraysbatch.sh counters.csv errors.txt log_02_21_2023_HH08_13.txt log_02_21_2023_HH08_28.txt log_02_21_2023_HH08_30.txt objects.csv recipe.toml slurm-60568508_2.out slurm-60568508_4.out
channels.csv datacurator_latest.sif in.txt log_02_21_2023_HH08_27.txt log_02_21_2023_HH08_29.txt mcsdetect.sif out.txt slurm-60568508_1.out slurm-60568508_3.out slurm-60568508_5.out
Logs are saved in the formats log_MM_DD_YYYY_HH_MM.txt and slurm-JOBID_CELLID.out. For example, the 5th cell is logged in slurm-60568508_5.out.
You can open these while they are being processed to view progress:
tail -n 10 slurm-60568508_5.out
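To follow a log live while the job writes to it (press Ctrl-C to stop following):
tail -f slurm-60568508_5.out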
If something goes wrong you can cancel jobs:
scancel JOBID
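scancel on the parent job id cancels the entire array; to cancel a single array task, append its index:
scancel 60568508 # cancel all array tasks
scancel 60568508_5 # cancel only the 5th cell's task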
You can now run the postprocessing on the output, as described below.
You can do this with the Singularity image (mcsdetect.sif). Let's create a new directory to store the postprocessing results:
export POSTDATA="/scratch/$USER/post"
mkdir -p $POSTDATA
We'll need the data output directory you used earlier:
export OUTPUT="/project/myresearchgroup/myoutput"
Navigate to the directory you created earlier
cd $EXPERIMENT
and make sure mcsdetect.sif is in this directory.
Download a script that does the postprocessing for you:
wget https://raw.githubusercontent.com/bencardoen/SubPrecisionContactDetection.jl/main/hpcscripts/postprocess.sh
chmod u+x postprocess.sh
Update the script with your variables; please make sure the variables below are set correctly, as done above:
sed -i "s|EMAIL|${MYEMAIL}|" postprocess.sh # Your email
sed -i "s|MYACCOUNT|${MYACCOUNT}|" postprocess.sh # Your cluster account
sed -i "s|POSTOUTPUT|${POSTDATA}|" postprocess.sh # The location where you want the postprocessed results
sed -i "s|INPUT|${OUTPUT}|" postprocess.sh # The computed contacts location, previously saved in $OUTPUT
Then submit
sbatch postprocess.sh
If you want to change the parameters, please edit the script (using nano, vim, ...) before submitting.
You'll get an email when your results are done.
I recommend using Globus to export results efficiently.
- Log in to Globus using your cluster ID
- Select $POSTDATA as the source for the transfer
- Select a directory on your target computer as the target
- Execute; you'll get an email when the transfer completes.
See the cluster's Globus wiki for details.
You can use DataCurator to write a recipe that, for example, only collects the CSV files of the contacts and sends them to OwnCloud, or computes statistics of the contacts, mitochondria, and so on. Examples are in the DC documentation, and execution is the same as in step 3.
If you run into problems or something is not clear, please create a new issue on the GitHub repository.