Tutorial

Riyue Bao edited this page Aug 28, 2020 · 2 revisions

Installation

There are two ways to start using nDSPA.

  • To launch the R Shiny app on the web:
  1. Go to https://riyuebao.shinyapps.io/ndspa

  2. Start using nDSPA!

  • To launch the R Shiny app locally:
  1. Download the software from the GitHub repository: git clone git@github.com:riyuebao/nDSPA.git

  2. Navigate into the nDSPA directory on your computer.

  3. Within the nDSPA directory, open app.R in RStudio.

  4. Once app.R is open in RStudio, click the Run App button in the top right corner.

  5. Bravo! You are there.

nDSPA is under rapid development. Options might change between versions!

Test data

The test data files were simulated from DSP samples that we collected in the lab.

  1. If you are using the online version, download the test data files from the GitHub repo.
  2. If you are using the local application, the test files are downloaded along with the code when you run git clone, and are in the folder testdataSIM.

The test data files provided along with nDSPA are synthetic data simulated from real-world data collected from tumor samples, and should be used for testing software functions only. Do not use the test data for research or clinical questions.

Quick start

QC and Normalization

  1. Once you are on the nDSPA page, the first step is to import data. We have provided test data files for a demo run. Click Import Data, then import two files: (1) under Choose the raw data file in the scale of choice, click Browse and select 01-1.dsp_data.raw.sim.txt from the testdata folder on your local computer; (2) under Input ROI Metadata Table, click Browse and select 01-2.dsp_roi.metadata.sim.txt from the testdata folder on your local computer.

  2. You can turn QC/filter and Scale and Normalize on or off independently. By default, both are off. We recommend turning both on for your analysis.

  3. Let's turn on QC/filter by clicking its sliding toggle. A new panel appears below the two options, containing QC filter options and Filtered data. Click QC filter options to see the filtering panel with its default parameters.

  4. Now let's turn on Scale and Normalize. A new panel, Scale and Normalization Options, appears at the bottom of the page. Change Normalization Method to SNR, leave Select Background Negative Controls for SNR at its default, and change Calculation Method for SNR to Geometric Mean.

  5. Then move to the top of the page and click Data Plots. Click through PCA probes, PCA samples, HK Corr, etc. to see the QC and heatmap plots.

  6. Below the plots, on the same page, you will see the data tables related to the plots: Annotations (ROI and sample annotations), All values (raw expression values), Data matrix (expression values resulting from your QC and normalization steps), and Probes (annotations for the NanoString probes of the DSP panel in your experiment).

  7. Now move to the top of the page and click Expression Map. This is the spatial bubble plot visualizing gene (or protein) expression of spatially selected ROIs. Click the button and the Expression Map Selector panel will appear. Click Browse and select P001_1B.png from the testdata folder on your local computer.

  8. If you have multiple scans (images/samples), you can choose which one to show with Scan ID of Image. Here we have two scans, and we have an image for scan P001_1B, so select the P001_1B option. Then pick your gene of interest with Probe of interest. The expression of this gene is shown as scaled bubbles on top of the corresponding ROIs on the image.
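
Step 4 above selects SNR normalization with the geometric-mean calculation. As a rough illustration of what that kind of normalization does, the following Python sketch divides each probe count in one ROI by the geometric mean of that ROI's negative-control probes. This is not nDSPA's code (nDSPA is an R Shiny app), and the exact formula nDSPA applies is not stated on this page, so treat the function names, probe names, and counts below as hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all > 0")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def snr_normalize(counts, negative_controls):
    """Divide every probe count in one ROI by the geometric mean of
    that ROI's negative-control probes (assumed definition of SNR)."""
    background = geometric_mean([counts[p] for p in negative_controls])
    return {probe: value / background for probe, value in counts.items()}

# Hypothetical raw counts for a single ROI (made-up probe names and numbers).
roi_counts = {"CD3": 400.0, "CD8": 100.0, "NegCtrl_A": 20.0, "NegCtrl_B": 5.0}

snr = snr_normalize(roi_counts, ["NegCtrl_A", "NegCtrl_B"])
for probe, value in snr.items():
    print(probe, round(value, 3))
```

In this sketch the background is the geometric mean of 20 and 5, so a probe with an SNR near 1 sits at background level, while larger values indicate signal above the negative controls.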

Statistical analysis

  1. Here you can compare groups of interest and detect genes (or proteins) that are differentially expressed between groups. Because multiple ROIs are collected per scan per subject, we use a linear mixed-effects model for the statistical comparisons. First, we need to tell the program the group assignment of the study subjects, e.g., responder (R) or non-responder (NR). For the statistical test, we use a second set of test files, which has more than one subject per group.

    1.1 First, upload a new expression data table. Go to QC and Normalization.

    1.1.1 Click Browse and select 02-1.dsp_data.raw.sim.txt as the input for Expression; skip ROI Metadata.

    1.1.2 Then turn on both QC & filter and Scale & normalize.

    1.1.3 Leave QC / filter options as default. Scroll down to the normalization panel, select SNR as the Normalization method, and select Geomean as the Calculation method for SNR.

    1.2 Now, we need to upload the grouping information.

    1.2.1 Go to Statistical Analysis. Click Browse, and select 02-3.dsp_group.sim.txt as the input for Grouping.

    1.2.2 More options now become available on the page, showing the groups (R and NR). A table also appears at the bottom of the page showing the merged data table (normalized expression values + grouping).

    1.3 Specify the following options for test:

    • Test Method used = t Test (you can also select z Test if the sample size is larger)
    • Select Fixed Effect = group (this is your variable of interest)
    • Select Random Effect = subject id (this is the de-identified ID for each subject, e.g. patient 01, patient 02, etc.)
    • Select Static Segment = CD45+ (select the segment in which you want to test for expression differences. The test data has only one segment, CD45+, but researchers commonly generate data from multiple segments in DSP)
    • First Group for Comparison = NR (select the case group; fold change is calculated as group 1 vs group 2)
    • Second Group for Comparison = R (select the reference group; fold change is calculated as group 1 vs group 2)

    1.4 Click Run Stat Calculations. After a few seconds, the differentially expressed genes appear in a new table at the bottom of the page. You can use the Search box to find your favorite gene, or sort the table by the p.value column.

    1.5 There you go - you have completed the full set of DSP analysis! :)
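
To make the roles of the grouping variable and the subject id more concrete, here is a small Python sketch of the "group 1 vs group 2" fold change on a merged table. It is a deliberate simplification and not the linear mixed-effects model that nDSPA fits: it averages the ROIs within each subject first, so that subjects with many ROIs do not dominate, and then compares the group means. The rows, the function names, and the choice of a log2 scale are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical merged rows: (subject id, group, normalized expression of one
# probe in one ROI). All values are made up for illustration.
merged_rows = [
    ("patient01", "NR", 6.0), ("patient01", "NR", 10.0),
    ("patient02", "NR", 7.0), ("patient02", "NR", 9.0),
    ("patient03", "R", 1.0), ("patient03", "R", 3.0),
    ("patient04", "R", 2.0), ("patient04", "R", 2.0),
]

def subject_means(rows):
    """Average the ROIs of each subject so every subject contributes one value."""
    by_subject = defaultdict(list)
    group_of = {}
    for subject, group, value in rows:
        by_subject[subject].append(value)
        group_of[subject] = group
    return {s: (group_of[s], mean(v)) for s, v in by_subject.items()}

def log2_fold_change(rows, group1, group2):
    """log2 of (mean of group1 subject means) / (mean of group2 subject means)."""
    per_subject = list(subject_means(rows).values())
    g1 = mean(v for g, v in per_subject if g == group1)
    g2 = mean(v for g, v in per_subject if g == group2)
    return math.log2(g1 / g2)

# First group = NR, second group = R, matching the options chosen above.
lfc = log2_fold_change(merged_rows, "NR", "R")
print(lfc)
```

With these made-up numbers the value is positive, meaning the probe is higher in NR than in R; swapping the two groups flips the sign. A mixed-effects model goes further by estimating the between-subject variation explicitly, which is why subject id is supplied as the random effect.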

Note that different studies have very different designs, and you should pick the statistical option that best suits your question. If unsure, please consult a statistician.

Help

For questions and issues, please submit them on the GitHub issue page.
