# v3.4.0 Release Notes, Installation, and Usage
- Release notes
- Prerequisites
- Notes on gradient nonlinearity correction
- Installation
- Getting example data
- Running the HCP Pipelines on example data
- The ICA FIX pipeline
- A note about resource requirements
- Hint for detecting Out of Memory conditions
- I still have questions
## Release notes
- The HCP Pipelines scripts are being released in essentially the form that they were successfully used for processing data provided in the Human Connectome Project 500 Subjects Data Release. That processing was done using 64-bit Red Hat Linux version 5.
- Some improvements to the documentation have been made since they were used for that release.
- One known difference between these scripts as they were used for the 500 Subjects Data Release and their currently released form is the version of the `wb_command` from the Connectome Workbench that is used. When these scripts were being used for the 500 Subjects Data Release, Connectome Workbench had not reached its 1.0 release level. Therefore, we used an earlier version of the `wb_command` for processing data for the data release. Distributing that older version of the `wb_command` binary as part of this product is less than ideal because that embedded binary is platform specific and is not part of an official Connectome Workbench release. If needed, the Red Hat/CentOS specific version of the `wb_command` binary and its associated libraries are available in older, tagged, pre-release versions of this project.
- We have begun the process of making the scripts more robust, well-documented, and usable in other environments (other versions of operating systems, other queueing systems, other versions of prerequisite tools, etc.), but that process is not complete. Therefore, we cannot guarantee that you will not have to make modifications to the scripts for them to be used successfully in your environment.
- A significant part of the value of the open source model of software development is the ability to improve software based on source code level feedback from the community of users. Along those lines, if you find that you do need to make changes to the source code to use the tools successfully, we would welcome feedback and improvement suggestions. (If you have to change something in the scripts to make them work for you, let us know and we'll evaluate how/if to incorporate those changes into the released product.) Discussion of HCP Pipelines usage and improvements can be posted to the hcp-users discussion list. Sign up for hcp-users at http://humanconnectome.org/contact/#subscribe
- For the HCP 500 Subjects Data Release, FSL version 5.0.6 was used with these scripts. There is a known issue with using FSL 5.0.7 for the Task fMRI Analysis Pipeline. See the Prerequisites section below for further information.
- Improvements to the internal documentation of these scripts are planned.
- Validation tests to ensure that your installation is working correctly with at least the sample data are also planned but not yet available.
## Prerequisites
The HCP Pipelines have the following software requirements:
- A 64-bit Linux Operating System
- The FMRIB Software Library (a.k.a. FSL) version 5.0.6, installed with its configuration file properly sourced.

  NB: This version of the HCP Pipelines requires version 5.0.6 of FSL, not version 5.0.6 or greater. This version of the HCP Pipelines is not fully tested with any version of FSL other than version 5.0.6. Preliminary testing has detected a difference in behavior between version 5.0.6 and version 5.0.7 of FSL which, while it is an intentional improvement to FSL, is known to cause the Task Analysis pipeline in particular to fail.

  There is currently a separate branch in this repository named `fsl-5.0.7-changes`. That branch is not yet included in a released version of the code, but it contains changes to the Task Analysis pipeline that we expect will fix that pipeline so that it works with version 5.0.7 of FSL. These changes are not fully tested, but they are available to anyone who wants to run the Task Analysis pipeline and use FSL 5.0.7.
- FreeSurfer version 5.3.0-HCP available at ftp://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/5.3.0-HCP

  NB: You must create and install a license file for FreeSurfer by visiting and submitting the FreeSurfer registration form.

  NB: The version of FreeSurfer used is a special release of FreeSurfer which is not part of the normal release cycle. The 5.3.0-HCP version of FreeSurfer contains a slightly different version of the `mris_make_surfaces` program than is part of the standard FreeSurfer 5.3.0 release.
- Connectome Workbench version 1.0

  The HCP Pipelines scripts use the `wb_command`, which is part of the Connectome Workbench. They locate the `wb_command` using an environment variable. Instructions for setting this environment variable are provided below in the Running the HCP Pipelines on example data section.
- The HCP version of gradunwarp version 1.0.2 (if gradient nonlinearity correction is to be done)
## Notes on gradient nonlinearity correction
- Gradient Nonlinearity Correction is sometimes also referred to as Gradient Distortion Correction or GDC.
- As is true of the other prerequisite pieces of software, the HCP version of gradunwarp has its own set of prerequisites. See the HCP gradunwarp README file for those prerequisites.
- In order to run HCP gradunwarp, you will need a gradient coefficients file to use as an input to the gradient distortion correction process. Please see questions 7 and 8 in the HCP Pipelines FAQ for further information about gradient nonlinearity correction and obtaining a gradient coefficients file.
- The HCP Pipelines scripts expect to be able to find the main module of the gradunwarp tool (`gradient_unwarp.py`) within a directory specified in the `PATH` environment variable.
- As distributed, the example scripts that serve as templates for running various types of pipeline processing are set to not run gradient distortion correction. Commented out portions of those scripts illustrate how to change the variable settings to perform gradient distortion correction. These commented out portions assume that you have placed the gradient coefficients file in the standard configuration directory for your installation of HCP Pipelines (the `global/config` directory within your HCP Pipelines installation directory). A brief sketch of this setup is shown below.
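As a concrete illustration, here is a minimal sketch of the setup described in the notes above. The gradunwarp installation path is an assumption, and the coefficients file name is the Connectome Skyra example used later in this document; substitute your actual locations and file:

```bash
# Make gradient_unwarp.py findable on the PATH
# (the gradunwarp installation directory shown here is an assumption):
export PATH="${HOME}/tools/gradunwarp/bin:${PATH}"

# Place your gradient coefficients file in the standard configuration directory
# (file name shown is the Connectome Skyra example from the Getting example data section):
cp coeff_SC72C_Skyra.grad "${HCPPIPEDIR}/global/config/"
```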
## Installation
- Install the listed prerequisites first.
  - Installation Notes for FSL
    - Once you have installed FSL, verify that you have the correct version of FSL by simply running the `$ fsl` command. The FSL window that shows up should identify the version of FSL you are running in its title bar.
    - Sometimes FSL is installed without the separate documentation package; it is most likely worth the extra effort to install the FSL documentation package.
  - Ubuntu Installation Notes for FreeSurfer
    - For Linux, FreeSurfer is distributed in gzipped tarballs for CentOS 4 and CentOS 6.
    - The instructions here provide guidance for installing FreeSurfer on Ubuntu. If following those instructions, be sure to download version 5.3.0-HCP of FreeSurfer and not version 5.1.0 as those instructions indicate.
    - Ubuntu (at least starting with version 12.04 and running through version 14.04 LTS) is missing a library that is used by some parts of FreeSurfer. To install that library, enter `$ sudo apt-get install libjpeg62`.
- Download the necessary compressed tar file (.tar.gz) for the HCP Pipelines release.
- Move the compressed tar file that you downloaded to the directory in which you want the HCP Pipelines to be installed, e.g. `$ mv Pipelines-3.4.0.tar.gz ~/projects`
- Extract the files from the compressed tar file, e.g.

  ```
  $ cd ~/projects
  $ tar xvf Pipelines-3.4.0.tar.gz
  ```
- This will create a directory containing the HCP Pipelines, e.g.

  ```
  $ cd ~/projects/Pipelines-3.4.0
  $ ls -F
  DiffusionPreprocessing/  fMRIVolume/   PostFreeSurfer/  TaskfMRIAnalysis/
  Examples/                FreeSurfer/   PreFreeSurfer/   tfMRI/
  FAQ.md                   global/       product.txt      VersionHistory.md*
  fMRISurface/             LICENSE.md*   README.md        version.txt
  $
  ```
- This newly created directory is your HCP Pipelines Directory.

  In this documentation, in documentation within the script files themselves, and elsewhere, we will use the terminology HCP Pipelines Directory interchangeably with `HCPPIPEDIR`, `$HCPPIPEDIR`, or `${HCPPIPEDIR}`. More specifically, `$HCPPIPEDIR` and `${HCPPIPEDIR}` refer to an environment variable that will be set to contain the path to your HCP Pipelines Directory.
## Getting example data
Example data for becoming familiar with the process of running the HCP Pipelines and testing your installation is available from the Human Connectome Project.

If you already have (or will be obtaining) the gradient coefficients file for the Connectome Skyra scanner used to collect the sample data and want to run the pipelines including the steps which perform gradient distortion correction, you can download a zip file containing example data here. In that case, you will need to place the obtained gradient coefficients file (`coeff_SC72C_Skyra.grad`) in the `global/config` directory within your HCP Pipelines Directory.

If you do not have and are not planning to obtain the gradient coefficients file for the Connectome Skyra scanner used to collect the sample data and want to run the pipelines on files on which gradient distortion correction has already been performed, you should download a zip file containing example data here.
The remainder of these instructions assume you have extracted the example data into the directory `~/projects/Pipelines_ExampleData`. You will need to modify the instructions accordingly if you have extracted the example data elsewhere.
## Running the HCP Pipelines on example data

### Structural preprocessing
Structural preprocessing is subdivided into 3 parts (Pre-FreeSurfer processing, FreeSurfer processing, and Post-FreeSurfer processing). These 3 steps should be executed in the order specified, and each of these 3 parts is implemented as a separate `bash` script.
In the `${HCPPIPEDIR}/Examples/Scripts` directory, you will find a shell script for running a batch of subject data through the Pre-FreeSurfer part of structural preprocessing. This shell script is named: `PreFreeSurferPipelineBatch.sh`. You should review and possibly edit that script file to run the example data through the Pre-FreeSurfer processing.
#### StudyFolder

The setting of the `StudyFolder` variable near the top of this script should be verified or edited. This variable should contain the path to a directory that will contain data for all subjects in subdirectories named for each of the subject IDs.

As distributed, this variable is set with the assumption that you have extracted the sample data into a directory named `projects/Pipelines_ExampleData` within your login or "home" directory.

```
StudyFolder="${HOME}/projects/Pipelines_ExampleData"
```

You should either verify that your example data is extracted to that location or modify the variable setting accordingly.
#### Subjlist

The setting of the `Subjlist` variable, which comes immediately after the setting of the `StudyFolder` variable, should also be verified or edited. This variable should contain a space delimited list of the subject IDs for which you want the Pre-FreeSurfer processing to run.

As distributed, this variable is set with the assumption that you will run the processing only for the single example subject, which has a subject ID of `100307`.

```
Subjlist="100307"
```

Using this value in conjunction with the value of the `StudyFolder` variable, the script will look for a directory named `100307` within the directory `${HOME}/projects/Pipelines_ExampleData`. This is where it will expect to find the data it is to process.

You should either verify that your example data is in that location or modify the variable setting accordingly.
#### EnvironmentScript

The `EnvironmentScript` variable should contain the path to a script that sets up the environment variables that are necessary for running the Pipeline scripts.

As distributed, this variable is set with the assumption that you have installed the HCP Pipelines in the directory `${HOME}/projects/Pipelines` (i.e. that your HCP Pipelines Directory is `${HOME}/projects/Pipelines`) and that you will use the example environment setup provided in the `Examples/Scripts/SetUpHCPPipeline.sh` script.

You may need to update the setting of the `EnvironmentScript` variable to reflect where you have installed the HCP Pipelines.
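For reference, a value of `EnvironmentScript` consistent with the assumptions above would look like the following sketch; the exact value in your copy of the batch script may differ:

```bash
# Assumes the installation location described above; adjust as needed.
EnvironmentScript="${HOME}/projects/Pipelines/Examples/Scripts/SetUpHCPPipeline.sh"
```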
#### GradientDistortionCoeffs

Further down in the script, the `GradientDistortionCoeffs` variable is set. This variable should be set to contain either the path to the gradient coefficients file to be used for gradient distortion correction or the value `NONE` to skip over the gradient distortion correction step.

As distributed, the script sets the variable to skip the gradient distortion correction step.

You will need to update the setting of this variable if you have a gradient coefficients file to use and want to perform the gradient distortion correction.
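A minimal sketch of the two possible settings follows; the coefficients file path shown is only an illustration based on the `global/config` location described earlier:

```bash
# As distributed: skip gradient distortion correction
GradientDistortionCoeffs="NONE"

# Or, if you have a coefficients file available (illustrative path):
# GradientDistortionCoeffs="${HCPPIPEDIR}/global/config/coeff_SC72C_Skyra.grad"
```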
#### HCPPIPEDIR and the SetUpHCPPipeline.sh script

The script file referenced by the `EnvironmentScript` variable in the `PreFreeSurferPipelineBatch.sh` file (by default the `SetUpHCPPipeline.sh` file in the `Examples/Scripts` folder) does nothing but establish values for all the environment variables that will be needed by various pipeline scripts.

Many of the environment variables set in the `SetUpHCPPipeline.sh` script are set relative to the `HCPPIPEDIR` environment variable.

As distributed, the setting of the `HCPPIPEDIR` environment variable assumes that you have installed the HCP Pipelines in the `${HOME}/projects/Pipelines` directory. You may need to change this to reflect your actual installation directory.

As distributed, the `SetUpHCPPipeline.sh` script assumes that you have:

- properly installed FSL
- set the FSLDIR environment variable
- sourced the FSL configuration script
- properly installed FreeSurfer
- set the FREESURFER_HOME environment variable
- sourced the FreeSurfer setup script

Example statements for setting FSLDIR, sourcing the FSL configuration script, setting FREESURFER_HOME, and sourcing the FreeSurfer setup script are provided but commented out in the `SetUpHCPPipeline.sh` script prior to setting `HCPPIPEDIR`.
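A minimal sketch of such statements is shown below; the installation paths are assumptions and should be replaced with wherever FSL, FreeSurfer, and the HCP Pipelines are actually installed on your system:

```bash
# Paths below are assumptions; adjust for your installation.
export FSLDIR="/usr/local/fsl"
. "${FSLDIR}/etc/fslconf/fsl.sh"             # source the FSL configuration script
export FREESURFER_HOME="/usr/local/freesurfer"
. "${FREESURFER_HOME}/SetUpFreeSurfer.sh"    # source the FreeSurfer setup script
export HCPPIPEDIR="${HOME}/projects/Pipelines"
```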
#### CARET7DIR in the SetUpHCPPipeline.sh script

The `CARET7DIR` variable must provide the path to the directory in which to find the Connectome Workbench `wb_command`. As distributed, `CARET7DIR` is set with the assumption that the necessary `wb_command` binary is installed in the `${HOME}/workbench/bin_linux64` directory.

It is very likely that you will need to change the value of the `CARET7DIR` environment variable to indicate the location of your installed version of the `wb_command`.
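For example, a setting consistent with the distributed assumption would look like this sketch; change the path to point at your own Workbench installation:

```bash
# Directory containing wb_command; the path shown is the distributed assumption.
export CARET7DIR="${HOME}/workbench/bin_linux64"
```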
#### Running the Pre-FreeSurfer processing after editing the setup script

Once you have made any necessary edits as described above, Pre-FreeSurfer processing can be invoked by commands similar to:

```
$ cd ~/projects/Pipelines/Examples/Scripts
$ ./PreFreeSurferPipelineBatch.sh
This script must be SOURCED to correctly setup the environment
prior to running any of the other HCP scripts contained here
100307
Found 1 T1w Images for subject 100307
Found 1 T2w Images for subject 100307
About to use fsl_sub to queue or run /home/user/projects/Pipelines/PreFreeSurfer/PreFreeSurferPipeline.sh
```
After reporting the number of T1w and T2w images found, the `PreFreeSurferPipelineBatch.sh` script uses the FSL command `fsl_sub` to submit a processing job which ultimately runs the `PreFreeSurferPipeline.sh` pipeline script.

If your system is configured to run jobs via an Oracle Grid Engine cluster (previously known as a Sun Grid Engine (SGE) cluster), then `fsl_sub` will submit a job to run the `PreFreeSurferPipeline.sh` script on the cluster and then return you to your system prompt. You can check on the status of your running cluster job using the `qstat` command. See the documentation of the `qstat` command for further information.
The standard output (stdout) and standard error (stderr) for the job submitted to the cluster will be redirected to files in the directory from which you invoked the batch script. Those files will be named `PreFreeSurferPipeline.sh.o<job-id>` and `PreFreeSurferPipeline.sh.e<job-id>` respectively, where `<job-id>` is the cluster job ID. You can monitor the progress of the processing with a command like:

```
$ tail -f PreFreeSurferPipeline.sh.o1434030
```

where 1434030 is the cluster job ID.
If your system is not configured to run jobs via an Oracle Grid Engine cluster, `fsl_sub` will run the `PreFreeSurferPipeline.sh` script directly on the system from which you launched the batch script. Your invocation of the batch script will appear to reach a point at which "nothing is happening." However, the `PreFreeSurferPipeline.sh` script will be launched in a separate process, and the standard output (stdout) and standard error (stderr) will have been redirected to files in the directory from which you invoked the batch script. The files will be named `PreFreeSurferPipeline.sh.o<process-id>` and `PreFreeSurferPipeline.sh.e<process-id>` respectively, where `<process-id>` is the operating system assigned unique process ID for the running process.
A `tail` command similar to the one above will allow you to monitor the progress of the processing.
Keep in mind that depending upon your processor speed and whether or not you are performing gradient distortion correction, the Pre-FreeSurfer phase of processing can take several hours.
In the `${HCPPIPEDIR}/Examples/Scripts` directory, you will find a shell script for running a batch of subject data through the FreeSurfer part of structural preprocessing. This shell script is named: `FreeSurferPipelineBatch.sh`. You should review and possibly edit that script file to run the example data through the FreeSurfer processing.
The `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables are set near the top of the script and should be verified and edited as indicated above in the discussion of Pre-FreeSurfer processing. Your environment script (`SetUpHCPPipeline.sh`) will need to have the same environment variables set as for the Pre-FreeSurfer processing.
Once you have made any necessary edits as described above and the Pre-FreeSurfer processing has completed, invoking FreeSurfer processing is quite similar to invoking Pre-FreeSurfer processing. The command will be similar to:

```
$ cd ~/projects/Pipelines/Examples/Scripts
$ ./FreeSurferPipelineBatch.sh
This script must be SOURCED to correctly setup the environment
prior to running any of the other HCP scripts contained here
100307
About to use fsl_sub to queue or run /home/user/projects/Pipelines/FreeSurfer/FreeSurferPipeline.sh
```
As above, the `fsl_sub` command will either start a new process on the current system or submit a job to an Oracle Grid Engine cluster. Also as above, you can monitor the progress by viewing the generated standard output and standard error files.
In the `${HCPPIPEDIR}/Examples/Scripts` directory, you will find a shell script for running a batch of subject data through the Post-FreeSurfer part of structural preprocessing. This shell script is named: `PostFreeSurferPipelineBatch.sh`. This script follows the same pattern as the batch scripts used to run Pre-FreeSurfer and FreeSurfer processing. That is, you will need to verify/edit the `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables that are set at the top of the script, and your environment script (`SetUpHCPPipeline.sh`) will need to set environment variables appropriately.
There is an additional variable in the `PostFreeSurferPipelineBatch.sh` script that needs to be considered. The `RegName` variable tells the pipeline whether to use MSMSulc for surface alignment. (See the FAQ for further information about MSMSulc.) As distributed, the `PostFreeSurferPipelineBatch.sh` script assumes that you do not have access to the `msm` binary to use for surface alignment. Therefore, the `RegName` variable is set to `"FS"`.
If you do have access to the `msm` binary and wish to use it, you can change this variable's value to `"MSMSulc"`. If you do so, then your environment script (e.g. `SetUpHCPPipeline.sh`) will need to set an additional environment variable which is used by the pipeline scripts to locate the `msm` binary. That environment variable is `MSMBin`, and it is set in the distributed example `SetUpHCPPipeline.sh` file as follows:

```
export MSMBin=${HCPPIPEDIR}/MSMBinaries
```
You will need to either place your `msm` executable binary file in the `${HCPPIPEDIR}/MSMBinaries` directory or modify the value given to the `MSMBin` environment variable so that it contains the path to the directory in which you have placed your copy of the `msm` executable.
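Putting those pieces together, a sketch of the relevant settings might look like the following; whether you use `"FS"` or `"MSMSulc"` depends on whether the `msm` binary is available to you:

```bash
# In PostFreeSurferPipelineBatch.sh -- as distributed (no msm binary available):
RegName="FS"
# If you have msm and want MSMSulc surface alignment instead:
# RegName="MSMSulc"

# In SetUpHCPPipeline.sh -- only needed when RegName="MSMSulc":
# export MSMBin=${HCPPIPEDIR}/MSMBinaries
```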
Once those things are taken care of and FreeSurfer processing is completed, commands like the following can be issued:

```
$ cd ~/projects/Pipelines/Examples/Scripts
$ ./PostFreeSurferPipelineBatch.sh
This script must be SOURCED to correctly setup the environment
prior to running any of the other HCP scripts contained here
100307
About to use fsl_sub to queue or run /home/user/projects/Pipelines/PostFreeSurfer/PostFreeSurferPipeline.sh
```
The `fsl_sub` command used in the batch script will behave as described above, and monitoring the progress of the run can be done as described above.
### Diffusion Preprocessing

Diffusion Preprocessing depends on the outputs generated by Structural Preprocessing. So Diffusion Preprocessing should not be attempted on data sets for which Structural Preprocessing is not yet complete.
The `DiffusionPreprocessingBatch.sh` script in the `${HCPPIPEDIR}/Examples/Scripts` directory is much like the example scripts for the 3 phases of Structural Preprocessing. The `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables set at the top of the batch script need to be verified or edited as above.

Like the `PreFreeSurferPipelineBatch.sh` script, the `DiffusionPreprocessingBatch.sh` script also needs a variable set to the path to the gradient coefficients file, or `NONE` if gradient distortion correction is to be skipped. In the `DiffusionPreprocessingBatch.sh` script that variable is `Gdcoeffs`. As distributed, this example script is set up with the assumption that you will skip gradient distortion correction. If you have a gradient coefficients file available and would like to perform gradient distortion correction, you will need to update the `Gdcoeffs` variable to contain the path to your gradient coefficients file.
### Functional Preprocessing

Functional Preprocessing depends on the outputs generated by Structural Preprocessing. So Functional Preprocessing should not be attempted on data sets for which Structural Preprocessing is not yet complete.
Functional Preprocessing is divided into 2 parts: Generic fMRI Volume Preprocessing and Generic fMRI Surface Preprocessing. Generic fMRI Surface Preprocessing depends upon output produced by the Generic fMRI Volume Preprocessing. So fMRI Surface Preprocessing should not be attempted on data sets for which fMRI Volume Preprocessing is not yet complete.
As is true of the other types of preprocessing discussed above, there are example scripts for running each of the two types of Functional Preprocessing.
The `GenericfMRIVolumeProcessingPipelineBatch.sh` script is the starting point for running volumetric functional preprocessing. Like the sample scripts mentioned above, you will need to verify or edit the `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables defined near the top of the batch processing script. Additionally, you will need to verify or edit the `GradientDistortionCoeffs` variable. As distributed, this value is set to `"NONE"` to skip gradient distortion correction.
In addition to these variable modifications, you should check or edit the contents of the `Tasklist` variable. This variable holds a space delimited list of the functional tasks that you would like preprocessed. As distributed, the `Tasklist` variable is set to only process the 2 HCP EMOTION tasks (`tfMRI_EMOTION_RL` and `tfMRI_EMOTION_LR`). You can add other tasks (e.g. `tfMRI_WM_RL`, `tfMRI_WM_LR`, `tfMRI_SOCIAL_RL`, `tfMRI_SOCIAL_LR`, etc.) to the list in the `Tasklist` variable to get volume preprocessing to occur on those tasks as well. The Resting State "tasks" can also be preprocessed this way by adding the specification of those tasks (e.g. `rfMRI_REST1_RL`, `rfMRI_REST1_LR`, `rfMRI_REST2_RL`, or `rfMRI_REST2_LR`) to the `Tasklist` variable. A brief sketch of this setting is shown below.
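Here is a minimal sketch of that variable; the first line reflects the distributed default, and the commented line illustrates one possible expanded list (which runs you include is up to you):

```bash
# As distributed: only the two EMOTION runs
Tasklist="tfMRI_EMOTION_RL tfMRI_EMOTION_LR"

# Illustrative expanded list including resting-state runs:
# Tasklist="tfMRI_EMOTION_RL tfMRI_EMOTION_LR rfMRI_REST1_RL rfMRI_REST1_LR"
```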
The `GenericfMRISurfaceProcessingPipelineBatch.sh` script is the starting point for running surface based functional preprocessing. As has been the case with the other sample scripts, you will need to verify or edit the `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables defined near the top of the batch processing script.

In addition to these variable modifications, you should check or edit the contents of the `Tasklist` variable. This variable holds a space delimited list of the functional tasks that you would like preprocessed. As distributed, the `Tasklist` variable is set to only process the 2 HCP EMOTION tasks (`tfMRI_EMOTION_RL` and `tfMRI_EMOTION_LR`). As above in the volume based functional preprocessing, you can add other tasks to the list in the `Tasklist` variable, including other actual tasks or the Resting State "tasks".

Like the Post-FreeSurfer pipeline, you will also need to set the `RegName` variable to either `MSMSulc` or `FS`.
### Task fMRI Analysis

Task fMRI (tfMRI) data can be further processed (after Functional Preprocessing is complete) using the `TaskfMRIAnalysisBatch.sh` script. The `TaskfMRIAnalysisBatch.sh` script runs Level 1 and Level 2 Task fMRI Analysis. As has been the case with the other sample scripts, you will need to verify or edit the `StudyFolder`, `Subjlist`, and `EnvironmentScript` variables defined at the top of this batch processing script.
In addition to these variable modifications, you should check or edit the contents of the `LevelOneTasksList`, `LevelOneFSFsList`, `LevelTwoTaskList`, and `LevelTwoFSFList` variables. As distributed, these variables are configured to perform Level 1 task analysis only on the RL and LR conditions for the EMOTION task and Level 2 task analysis on the combined results of the RL and LR Level 1 analysis for the EMOTION task. You can add other conditions for Level 1 and Level 2 analysis by altering the settings of these variables. Please be aware that changing these settings will alter the types of analysis that are done for all subjects listed in the `Subjlist` variable.
Note: If, instead of starting with unprocessed data and doing Structural and Functional Preprocessing yourself as described in this document, you are starting with data that the HCP has already Structurally and Functionally Preprocessed, then the FSF files described below, which are necessary for both Level 1 and Level 2 Task Analysis, have already been created and are supplied in the package of Functionally Preprocessed data provided by the HCP.
Level 1 Task Analysis requires FEAT setup files (FSF files) for each direction of the functional task.
For example, to perform Level 1 Task Analysis for the `tfMRI_EMOTION_RL` and `tfMRI_EMOTION_LR` tasks for subject `100307`, the following FEAT setup files must exist before running the Task Analysis pipeline:

```
<StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION_LR/tfMRI_EMOTION_LR_hp200_s4_level1.fsf
<StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION_RL/tfMRI_EMOTION_RL_hp200_s4_level1.fsf
```
Templates for these files can be found in the `${HCPPIPEDIR}/Examples/fsf_templates` directory. The number of time points entry in the template for each functional task must match the actual number of time points in the corresponding scan. An entry in the `tfMRI_EMOTION_LR_hp200_s4_level1.fsf` file might look like:

```
# Total volumes
set fmri(npts) 176
```
The `176` value must match the number of volumes (number of time points) in the corresponding `tfMRI_EMOTION_LR.nii.gz` scan file. After you have copied the appropriate template for a scan to the indicated location in the `MNINonLinear/Results/<task>` directory, you may have to edit the .fsf file to make sure the value that it has for Total volumes matches the number of time points in the corresponding scan file.
There is a script in the `${HCPPIPEDIR}/Examples/Scripts` directory named `generate_level1_fsf.sh` which can be either studied or used directly to retrieve the number of time points from an image file and set the correct Total volumes value in the .fsf file for a specified task.
Typical invocations of the `generate_level1_fsf.sh` script would look like:

```
$ cd ${HCPPIPEDIR}/Examples/Scripts
$ ./generate_level1_fsf.sh \
> --studyfolder=${HOME}/projects/Pipelines_ExampleData \
> --subject=100307 --taskname=tfMRI_EMOTION_RL --templatedir=../fsf_templates \
> --outdir=${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION_RL

$ ./generate_level1_fsf.sh \
> --studyfolder=${HOME}/projects/Pipelines_ExampleData \
> --subject=100307 --taskname=tfMRI_EMOTION_LR --templatedir=../fsf_templates \
> --outdir=${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION_LR
```
This must be done for every direction of every task for which you want to perform task analysis.
Level 1 Task Analysis also requires that E-Prime EV files be available in the `MNINonLinear/Results` subdirectory for each task on which Level 1 Task Analysis is to occur. These EV files are available in the example unprocessed data, but are not in the `MNINonLinear/Results` directory because that directory is created as part of Functional Preprocessing. Since Functional Preprocessing must be completed before Task Analysis can be performed, the `MNINonLinear/Results` folder should exist prior to Task Analysis.
There is a script in the `${HCPPIPEDIR}/Examples/Scripts` directory named `copy_evs_into_results.sh`. This script can be used to copy the necessary E-Prime EV files for a task into the appropriate place in the `MNINonLinear/Results` directory.
Typical invocations of the `copy_evs_into_results.sh` script would look like:

```
$ cd ${HCPPIPEDIR}/Examples/Scripts
$ ./copy_evs_into_results.sh \
> --studyfolder=${HOME}/projects/Pipelines_ExampleData \
> --subject=100307 \
> --taskname=tfMRI_EMOTION_RL

$ ./copy_evs_into_results.sh \
> --studyfolder=${HOME}/projects/Pipelines_ExampleData \
> --subject=100307 \
> --taskname=tfMRI_EMOTION_LR
```
This must be done for every direction of every task for which you want to perform task analysis.
Level 2 Task Analysis also requires a FEAT setup file. For example, to perform Level 2 Task Analysis for the `tfMRI_EMOTION` task for subject `100307` (combining data from `tfMRI_EMOTION_RL` and `tfMRI_EMOTION_LR`), the following FEAT setup file must exist before running the Task Analysis pipeline:

```
<StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION/tfMRI_EMOTION_hp200_s4_level2.fsf
```
The template file named `tfMRI_EMOTION_hp200_s4_level2.fsf` in the `${HCPPIPEDIR}/Examples/fsf_templates` directory can be copied, unchanged, to the appropriate location before running the Task Analysis pipeline. You will likely have to create the level 2 results directory, e.g. `<StudyFolder>/100307/MNINonLinear/Results/tfMRI_EMOTION` (notice that this directory name does not end with `_LR` or `_RL`), before you can copy the template into that directory.
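For the example subject and study folder used throughout this document, that copy step might look like the following sketch:

```bash
# Create the level 2 results directory (note: no _RL or _LR suffix) and copy
# the unchanged template into it; paths follow the example data layout above.
mkdir -p ${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION
cp ${HCPPIPEDIR}/Examples/fsf_templates/tfMRI_EMOTION_hp200_s4_level2.fsf \
   ${HOME}/projects/Pipelines_ExampleData/100307/MNINonLinear/Results/tfMRI_EMOTION/
```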
## The ICA FIX pipeline
Resting state fMRI (rfMRI) data can be further processed (after Functional Preprocessing is complete) using the FMRIB group's ICA-based X-noiseifier, FIX (ICA FIX). This processing regresses out motion timeseries and artifact ICA components (ICA is run using Melodic and components are classified using FIX) (Salimi-Khorshidi et al 2014).

The downloadable FIX tar file includes the `hcp_fix` file, which is a wrapper script for running ICA FIX on data that has been run through the HCP Structural and Functional Preprocessing. The `hcp_fix` script is run with a high-pass filter of 2000.
## A note about resource requirements
The memory and processing time requirements for running the HCP Pipelines scripts is relatively high. To provide an reference point, when the HCP runs these scripts to process data by submitting them to a cluster managed by a Portable Batch System (PBS) job scheduler, we generally request the following resource limits.
- Structural Preprocessing (Pre-FreeSurfer, FreeSurfer, and Post-FreeSurfer combined)
  - Walltime
    - Structural Preprocessing usually finishes within 24 hours.
    - We set the walltime limit to 24-48 hours and infrequently have to adjust it up to 96 hours.
  - Memory
    - We expect Structural Preprocessing to have maximum memory requirements in the range of 12 GB, but infrequently we have to adjust the memory limit up to 24 GB.
- Functional Preprocessing (Volume and Surface based preprocessing combined)
  - Time and memory requirements vary depending on the length of the fMRI scanning session.
  - In our protocol, resting state functional scans (rfMRI) are longer duration than task functional scans (tfMRI) and therefore have higher time and memory requirements.
  - Jobs processing resting state functional MRI (rfMRI) scans usually have walltime limits in the range of 36-48 hours and memory limits in the 20-24 GB range.
  - Jobs processing task functional MRI (tfMRI) scans may have resource limits that vary based on the task's duration, but generally are walltime limited to 24 hours and memory limited to 12 GB.
  - Often tfMRI preprocessing takes in the neighborhood of 4 hours and rfMRI preprocessing takes in the neighborhood of 10 hours.
- Diffusion Preprocessing
  - Time and memory requirements for Diffusion preprocessing will depend upon whether you are running the `eddy` portion of Diffusion preprocessing using the Graphics Processing Unit (GPU) enabled version of the `eddy` binary that is part of FSL. (The FMRIB group at Oxford University recommends using the GPU-enabled version of `eddy` whenever possible.)
  - Memory requirements for Diffusion preprocessing are generally in the 24-50 GB range.
  - Walltime requirements can be as high as 36 hours.
- Task fMRI Analysis
  - Walltime limits on Task fMRI Analysis are generally set to 24 hours, with actual walltimes expected to run from 4-12 hours per task.
  - Memory limits are set at 12 GB.
The limits listed above, in particular the walltime limits, are only useful if you have some idea of the capabilities of the compute nodes on which the jobs were run. For information about the configuration of the cluster nodes used to come up with the above limits/requirements, see the description of the hardware resources available at the Washington University Center for High Performance Computing (CHPC).
## Hint for detecting Out of Memory conditions
If one of your preprocessing jobs ends in a seemingly inexplicable way with a message in the stderr file (e.g. `DiffPreprocPipeline.sh.e<process-id>`) that indicates that your process was `Killed`, it is worth noting that many versions of Linux have a process referred to as the Out of Memory Killer or OOM Killer. When a system running an OOM Killer gets critically low on memory, the OOM Killer starts killing processes by sending them a `-9` signal. This type of process killing immediately stops the process from running, frees up any memory that process is using, and causes the return value from the killed process to be `137`. (By convention, this return value is 128 plus the signal number, which is 9. Thus a return value of 128+9=137.)
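A small, self-contained illustration of the 128-plus-signal-number convention described above; the command name here is purely hypothetical:

```bash
# Hypothetical long-running command; replace with a real pipeline step.
some_long_running_command
status=$?
if [ "${status}" -eq 137 ]; then
    # 137 = 128 + 9, i.e. the process was terminated by signal 9 (SIGKILL),
    # which is the signal the OOM Killer sends.
    echo "Process appears to have been killed by signal 9 (possibly the OOM Killer)"
fi
```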
For example, if the `eddy` executable used in Diffusion preprocessing attempts to allocate more memory than is available, it may be killed by the OOM Killer and return a status code of 137. In that case, there may be a line in the stderr that looks similar to:

```
/home/username/projects/Pipelines/DiffusionPreprocessing/scripts/run_eddy.sh: line 182: 39455 Killed ...*further info here*...
```

and a line in the stdout that looks similar to:

```
Sat Aug 9 21:20:21 CDT 2014 - run_eddy.sh - Completed with return value: 137
```
Lines such as these are a good hint that you are having problems with not having enough memory. Out of memory conditions and the subsequent killing of jobs by the OOM Killer can be confirmed by looking in the file where the OOM Killer logs its activities (`/var/log/kern.log` on Ubuntu systems, `/var/log/messages*` on some other systems).

Searching those log files for the word `Killed` may help find the log message indicating that your process was killed by the OOM Killer. Messages within these log files will often tell you how much memory was allocated by the process just before it was killed.
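For instance, a search along the lines described above might look like this sketch; the log file path follows the Ubuntu example given in the text, so adjust it for your system:

```bash
# Search the kernel log for OOM Killer activity (Ubuntu path shown; other
# systems may log to /var/log/messages* instead):
grep -i killed /var/log/kern.log
```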
Note that within the HCP Pipeline scripts, not all invocations of binaries or other scripts print messages to stderr or stdout indicating their return status codes. The above example is from a case in which the return status code is reported. So while this hint is intended to be helpful, it should not be assumed that all out of memory conditions can be discovered by searching the stdout and stderr files for return status codes of 137.
## I still have questions
Please review the FAQ.