Absolutely phenomenal tool set! I'm really excited by the possibility of running DE testing on the output of sctransform. Could you comment on the minimum number of cells per group needed for "robust" DE calling between two groups when using diff_mean_test()?
That's a good question, and I've always wanted to test this formally. More cells are always better, but a more nuanced answer takes two further aspects into account:
The expression level of the gene, i.e. mean UMI counts in group 1
The fold change, i.e. log2(mean_in_group1/mean_in_group2)
I've run simulations to get a better idea of how exactly these two factors affect the number of cells needed. This figure sums up the results:
For example, to detect a log2FC of 2 for a gene with mean 0.1 (so going from 0.1 to 0.4, bottom row, third panel), you would need about 100 cells per group. A decrease (negative log2FC, panel above) would be much harder to detect (ca. 80% recovery with 200 cells per group when going from 0.1 to 0.025).
On the other hand, if a gene is absent from one group and then goes up to medium-high (say from 0.001 to 1), even 20 cells will be sufficient.
Notes regarding these results:
No p-value correction for multiple testing
Not looking at false positives - this is not telling us anything about specificity overall
Assuming balanced group sizes
Assuming corrected counts are used, i.e. no additional variability in the counts due to sequencing depth
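For anyone who wants to explore these trade-offs themselves, here is a minimal sketch of this kind of recovery simulation. It is not the code behind the figure and not sctransform's implementation. It assumes Poisson-distributed counts, balanced groups, and a simple permutation test on the difference in means as a stand-in for diff_mean_test(). All function names and parameters (`sample_poisson`, `permutation_pvalue`, `estimate_recovery`, `n_perm`, `n_sim`, `alpha`) are hypothetical, and the fold-change direction follows the worked example above (a log2FC of 2 takes a mean of 0.1 to 0.4).

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson variate with Knuth's method (fine for small means)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def permutation_pvalue(rng, g1, g2, n_perm=200):
    """Two-sided permutation p-value for the difference in group means."""
    n1, n2 = len(g1), len(g2)
    observed = abs(sum(g1) / n1 - sum(g2) / n2)
    pooled = g1 + g2
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of cells
        diff = abs(sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / n2)
        if diff >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def estimate_recovery(mean1, log2fc, n_cells, n_sim=100, alpha=0.05, seed=0):
    """Fraction of simulated genes called DE at an uncorrected p < alpha."""
    rng = random.Random(seed)
    # Assumption: group 2 mean is group 1 mean scaled by 2^log2fc.
    mean2 = mean1 * 2 ** log2fc
    detected = 0
    for _ in range(n_sim):
        g1 = [sample_poisson(rng, mean1) for _ in range(n_cells)]
        g2 = [sample_poisson(rng, mean2) for _ in range(n_cells)]
        if permutation_pvalue(rng, g1, g2) < alpha:
            detected += 1
    return detected / n_sim
```

Calling, for example, `estimate_recovery(0.1, 2, 100)` gives a rough recovery rate for one combination of expression level, fold change, and group size. As in the notes above, there is no multiple-testing correction and false positives are not assessed, so the numbers will not necessarily match the figure.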