Update core analysis docs #278

Merged 4 commits (Jan 11, 2023)
Changes from 3 commits
53 changes: 30 additions & 23 deletions README.md

- [Core analysis overview](#core-analysis-overview)
- [Quick Start Guide](#quick-start-guide)
- [1. How to install the core downstream analyses workflow](#1-how-to-install-the-core-downstream-analyses-workflow)
- [a) Clone the repository](#a-clone-the-repository)
- [b) Install Snakemake](#b-install-snakemake)
- [c) Additional dependencies](#c-additional-dependencies)
- [Snakemake/conda installation](#snakemakeconda-installation)
- [2. Input data format](#2-input-data-format)
- [3. Metadata file format](#3-metadata-file-format)
- [4. Running the workflow](#4-running-the-workflow)
- [Project-specific parameters](#project-specific-parameters)
- [Processing parameters](#processing-parameters)
- [Filtering parameters](#filtering-parameters)
- [Dimensionality reduction and clustering parameters](#dimensionality-reduction-and-clustering-parameters)
- [5. Expected output](#5-expected-output)
- [What to expect in the output `SingleCellExperiment` object](#what-to-expect-in-the-output-singlecellexperiment-object)
- [Additional analysis modules](#additional-analysis-modules)
- [Clustering analysis](#clustering-analysis)
The workflow can directly take as input the `filtered` RDS files downloaded from the ScPCA Portal:
```
snakemake --cores 2 \
--use-conda \
--config results_dir="<RELATIVE PATH TO RESULTS DIRECTORY>" \
project_metadata="<RELATIVE PATH TO YOUR PROJECT METADATA TSV>"
```

Here, `results_dir` is the relative path to the directory where all results from running the workflow will be stored, and `project_metadata` is the relative path to the TSV file containing the relevant information about your input files.
See more information on project metadata in [section 3](#3-metadata-file-format) below.
**You will need to replace the paths for both `results_dir` and `project_metadata` to successfully run the workflow.**

**Note** that R 4.1 is required for running our pipeline, along with Bioconductor 3.14.
Package dependencies for the analysis workflows in this repository are managed using [`renv`](https://rstudio.github.io/renv/index.html), and `renv` must be installed locally prior to running the workflow.
If you are using conda, dependencies can be installed as [part of the setup mentioned in step 2 above](#snakemakeconda-installation).
There are two expected output files that will be associated with each provided `SingleCellExperiment` object.

See the [expected output section](#expected-output) for more information on these output files.

## 1. How to install the core downstream analyses workflow

### a) Clone the repository

First you will want to clone the [`scpca-downstream-analyses` repository](https://github.com/AlexsLemonade/scpca-downstream-analyses) from GitHub.

More instructions on cloning a GitHub repository can be found in [GitHub's documentation](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository).

Once the repository is successfully cloned, a folder named `scpca-downstream-analyses` containing a local copy of the contents of the repository will be created.

### b) Install Snakemake

The core downstream single-cell analysis pipeline, which includes filtering, normalization, dimensionality reduction, and clustering, is implemented using a Snakemake workflow.
Therefore, you will also need to install Snakemake before running the pipeline.
Note that the **minimum** version of Snakemake you will need to have installed to be compatible with conda is version **5.23.0**.
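As a quick sanity check on that minimum, the snippet below compares two version strings with `sort -V`; the `VER` value here is a placeholder assumption — substitute the output of `snakemake --version`:

```shell
# Compare an installed Snakemake version against the 5.23.0 minimum.
# VER is a placeholder; replace it with the output of `snakemake --version`.
MIN="5.23.0"
VER="7.3.8"
if [ "$(printf '%s\n%s\n' "$MIN" "$VER" | sort -V | head -n1)" = "$MIN" ]; then
  echo "Snakemake $VER meets the $MIN minimum"
else
  echo "Snakemake $VER is older than the required $MIN"
fi
```

With the placeholder values above, this prints that version 7.3.8 meets the minimum.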

You can install Snakemake by following the [instructions provided in Snakemake's docs](https://snakemake.readthedocs.io/en/v7.3.8/getting_started/installation.html#installation-via-conda-mamba).

The Snakemake documentation recommends installing it with the conda package manager.
Here are the instructions to [install conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html).
We recommend the Miniconda installation.

After installing conda, you can follow the steps below to set up the bioconda and conda-forge channels and install Snakemake in an isolated environment:
```
conda config --add channels bioconda
conda config --add channels conda-forge
mamba create -n snakemake snakemake
conda activate snakemake
```

### c) Additional dependencies

To run the Snakemake workflow, you will need to have R version 4.2 installed, as well as the `renv` package and pandoc.
This can be done independently, or you can use Snakemake's conda integration to set up an R environment that the workflow will use.
To use the environment you have just created, you will need to run Snakemake with the `--use-conda` flag.

If you would like to perform installation without the conda environments as described above, see the [independent installation instructions document](./independent-installation-instructions.md).

## 2. Input data format

The expected input for our core single-cell downstream analysis pipeline is a [`SingleCellExperiment` object](https://rdrr.io/bioc/SingleCellExperiment/man/SingleCellExperiment.html) that has been stored as an RDS file.
This `SingleCellExperiment` object should contain non-normalized gene expression data with barcodes as the column names and gene identifiers as the row names.
The pipeline in this repository is set up to process data available on the Single-cell Pediatric Cancer Atlas (ScPCA) Portal, which has been processed with the `scpca-nf` workflow.
For more information on this pre-processing, please see the [ScPCA Portal docs](https://scpca.readthedocs.io/en/latest/).
Note, however, that the input for this pipeline is **not required** to be `scpca-nf` processed output.

## 3. Metadata file format

Now the environment should be all set to run the Snakemake workflow.
Before running the workflow, you will need to create a project metadata file as a tab-separated value (TSV) file that contains the information about your input files needed to run the workflow.
Each library ID should have a unique `filepath`.
|[View Example Metadata File](https://github.com/AlexsLemonade/scpca-downstream-analyses/blob/main/project-metadata/example-library-metadata.tsv)|
|---|
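For illustration only, a minimal project metadata file might look like the sketch below; the column names and paths here are assumptions, so confirm them against the linked example metadata file:

```
library_id	sample_id	filepath
library01	sample01	data/sample01/library01_filtered.rds
```

Columns are separated by tabs, and each row describes one input `SingleCellExperiment` RDS file.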

## 4. Running the workflow

We have provided an example [snakemake configuration file](https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html), [`config/config.yaml`](config/config.yaml) which sets the defaults for all parameters needed to run the workflow.

The below code is an example of running the Snakemake workflow using project-specific parameters:
```
snakemake --cores 2 \
--use-conda \
--config results_dir="<RELATIVE PATH TO RESULTS DIRECTORY>" \
project_metadata="<RELATIVE PATH TO YOUR PROJECT METADATA TSV>" \
mito_file="<FULL PATH TO MITOCHONDRIAL GENES TXT FILE>"
```
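The `mito_file` is a plain-text list of mitochondrial gene identifiers. The format sketched below — one Ensembl gene ID per line — is an assumption; check the default mitochondrial gene file shipped with the repository for the exact format expected:

```
ENSG00000198695
ENSG00000198712
ENSG00000198727
```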

**You will need to replace the paths for `results_dir` and `project_metadata` to successfully run the workflow.**

**Note:** If you did not install dependencies [with conda via snakemake](#snakemakeconda-installation), you will need to remove the `--use-conda` flag.

You can also modify the relevant parameters by manually updating the `config/config.yaml` file using a text editor of your choice.
The project-specific parameters mentioned above can be found under the [`Project-specific parameters` section](./config/config.yaml#L3) of the config file, while the remaining parameters that can be optionally modified are found under the [`Processing parameters` section](./config/config.yaml#L11).
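As a sketch of what that file contains — the keys below mirror the `--config` overrides shown earlier, but the exact names and default values should be confirmed in [`config/config.yaml`](config/config.yaml) itself:

```yaml
# Project-specific parameters (illustrative values)
results_dir: "example_results"
project_metadata: "project-metadata/example-library-metadata.tsv"
mito_file: "<FULL PATH TO MITOCHONDRIAL GENES TXT FILE>"
```

Values set on the command line with `--config` take precedence over those in the file, so you can keep defaults in `config.yaml` and override them per run.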
Also note that new changes should be merged through a pull request to the `development` branch.
Changes will be pushed to the `main` branch once changes are ready for a new release (per the [release checklist document](.github/ISSUE_TEMPLATE/release-checklist.md)).

## 5. Expected output

For each `SingleCellExperiment` and associated `library_id` used as input, the workflow will return two files: a processed `SingleCellExperiment` object containing normalized data and clustering results, and a summary HTML report detailing the filtering of low quality cells, dimensionality reduction, and clustering that was performed within the workflow.
These files can be found in the `example_results` folder, as defined in the `config.yaml` file.
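For a single library, the results directory might therefore look like the following; the exact file names are hypothetical — the workflow defines its own naming scheme:

```
example_results/
├── <library_id>_processed.rds
└── <library_id>_core_analysis_report.html
```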