Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about how to interpret rho and FDR #17

Open
taylorreiter opened this issue Sep 1, 2020 · 2 comments
Open

Question about how to interpret rho and FDR #17

taylorreiter opened this issue Sep 1, 2020 · 2 comments
Labels
helpful This question has been marked as potentially helpful to others. question

Comments

@taylorreiter
Copy link

Hi @tpq, thank you for the wonderful CoDa methods you have implemented in propr! I read Microbiome Datasets Are Compositional: And This Is Not Optional at the beginning of my PhD and am excited to be using the methods now.

I've read your the following posts/papers, but I still have some questions about how to interpret and use propr.

Is there an analogy to draw between propr and e.g. R^2 from spearman correlation, where abs(rho@result$propr) indicates higher proportionality? I know the math is completely different, but I have no intuition for the meaning of propr or lrv. (I've seen you state that you're skeptical of negative rho, so I'm assuming these should just be ignored.)

    Partner Pair        lrv metric alpha        propr
1         2    1  1.3139167    rho  <NA>  0.809043652
2         3    1  2.4083228    rho  <NA>  0.711525696
3         3    2  3.0522840    rho  <NA>  0.637794103
4         4    1  3.8199330    rho  <NA>  0.600589599
5         4    2  4.2658219    rho  <NA>  0.557596470
6         4    3  2.2894219    rho  <NA>  0.793934072
7         5    1 17.3747962    rho  <NA>  0.050633754
8         5    2 15.2983714    rho  <NA>  0.167658262
9         5    3 22.5732504    rho  <NA> -0.137324292
10        5    4 26.1975924    rho  <NA> -0.243765501

I also don't understand how to interpret FDR from updateCutoffs(rho). Currently I'm arbitrarily using rho95 <- rho[">", .95] because I'm not sure how to interpret this table. I think this is telling me I could use 0.65 given that the FDR falls below .05 at this value...but I wanted to check.

  cutoff randcounts truecounts          FDR
1   0.05   11936.30      10353 1.1529315174
2   0.35     485.26       2443 0.1986328285
3   0.65       0.03        299 0.0001003344
4   0.95       0.00         19 0.0000000000

Lastly, I've seen your implementation for multi-omics data sets here and have it working. I have 16s, ITS, and metatranscriptomic data from the same N samples. I understand how to calculate proportionality between these datasets using your implementation, but I'm curious if I can calculate proportionality between environmental variables as well...we also took measurements like pH, nitrogen, latitude, longitude, etc., and I would like to know if any ASVs/transcripts are proportional to these variables.

@tpq
Copy link
Owner

tpq commented Sep 1, 2020

Hey Taylor!

Thanks for your interest in propr, and for asking such detailed questions!

I'll try to answer them quickly now, but I also invite follow-up questions.

(1)

Is there an analogy to draw between propr and e.g. R^2 from spearman correlation, where abs(rho@result$propr) indicates higher proportionality?

There is some analogy to draw between the two, and indeed higher values indicate stronger associations. The strength of the associations are defined with respect to the CLR transformation. When rho=1 for a pair i,j -- a plot of OTU i vs. OTU j will give a perfect line where the slope is 1 for OTUs expressed in CLR coordinates. (It can be helpful to think of the CLR as a kind of normalization that scales abundances according to the average for a sample. From this perspective, propr is sort of like a correlation between normalized variables.).

(2)

I also don't understand how to interpret FDR from updateCutoffs(rho). Currently I'm arbitrarily using rho95 <- rho[">", .95] because I'm not sure how to interpret this table.

I would read the table something like this -- if the cutoff were 0.65, the FDR, empirically speaking, would be 0.0001. So you can select it very safely! On the other hand, if the cutoff were 0.35, the FDR would be 20%. If you wanted an FDR of, say 5%, the 0.65 cutoff is too conservative. In this case, I would re-run

updateCutoffs(rho, cutoff = seq(0.35, .65, 0.05))

This will try additional cutoffs from 0.35 to 0.65. One of them may have your target FDR.

(3)

Lastly, I've seen your implementation for multi-omics data sets here and have it working. I have 16s, ITS, and metatranscriptomic data from the same N samples.

Great! I am glad it is useful! FYI we have written a pre-print about the methodological justification for "double CLR" approach here -- https://www.biorxiv.org/content/10.1101/847475v1

``but I'm curious if I can calculate proportionality between environmental variables as well...we also took measurements like pH, nitrogen, latitude, longitude, etc., and I would like to know if any ASVs/transcripts are proportional to these variables.''

Proportionality is useful for finding associations between components belonging to a composition. Variables like nitrogen, latitude, longitude, etc. are not components, but rather real measurements. In this case, I would CLR transform my OTUs manually, and fit a linear model per usual. The pseudo-code might look something like this:

clr <- function(x) log(x) - mean(log(x))
OTUs_as_CLR <- clr(OTUs+1)
lm(pH ~ OTUs_as_CLR)

Or you could popular multivariable ecology methods like RDA from vegan package:

library(vegan)
rda(OTUs_as_CLR, environmental_variables)

@tpq tpq changed the title Difficulty interpretting rho Difficulty interpretting rho and FDR Sep 1, 2020
@tpq tpq added the question label Sep 1, 2020
@tpq tpq changed the title Difficulty interpretting rho and FDR Question about how to interpret rho and FDR Sep 1, 2020
@tpq tpq added the helpful This question has been marked as potentially helpful to others. label Sep 1, 2020
@taylorreiter
Copy link
Author

Thank you! these explanations were extremely helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
helpful This question has been marked as potentially helpful to others. question
Projects
None yet
Development

No branches or pull requests

2 participants