Determining "CNA Signal" and "CNA Correlation" from inferCNV output files #439

Open
pshukla99 opened this issue Jul 22, 2022 · 6 comments

@pshukla99

Hi all,

I'm trying to follow the analysis in this paper (https://www.sciencedirect.com/science/article/pii/S0092867419306877?via%3Dihub#figs1) to quantitatively classify the cells in my sample as malignant vs non-malignant. I'm interested in computing "CNA Signal" which is defined as "mean of the squares of CNA values across the genome" and the "CNA Correlation" which is defined as "the correlation between the CNA profile of each cell and the average CNA profile of all cells from the corresponding tumor."
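
Read literally, the two quoted definitions can be written as below. The notation is introduced here for illustration and is not taken from the paper: $x_{c,g}$ is the CNA value of cell $c$ at gene $g$, $G$ is the number of genes, and $T$ is the set of cells from the same tumor as $c$. Treating "correlation" as Pearson correlation is an assumption.

$$
\mathrm{signal}_c = \frac{1}{G}\sum_{g=1}^{G} x_{c,g}^{2},
\qquad
\bar{x}_{g} = \frac{1}{|T|}\sum_{c' \in T} x_{c',g},
\qquad
\mathrm{corr}_c = \operatorname{cor}\!\left(x_{c,\cdot},\ \bar{x}_{\cdot}\right)
$$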

Which output files should I be looking at to find these CNA values? Also, are there any other ways to assign labels of malignant vs non-malignant to cells in a sample that are more quantitative than visual inspection of the final inferCNV heatmap? Thanks in advance.
Screen Shot 2022-07-21 at 7 49 34 PM

@sunshine1126

sunshine1126 commented Jul 26, 2022

I'm also interested in this. I hope there's a more quantitative way to identify malignant or non-malignant cells in a sample based on CNV.

@GeorgescuC
Collaborator

Hi @pshukla99, hi @sunshine1126,

If you want to do the analysis in R, you can use the infercnv object you have at the end of the analysis (it can be loaded again with infercnv_obj = readRDS("run.final.infercnv_obj")). The residual expression values in the infercnv_obj@expr.data slot can be used to calculate the mean of squares across the genome. Alternatively, you can read the text-file matrices that are written alongside each plot.
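
For the text-file route, a minimal Python sketch of the per-cell mean of squares could look like the following. It is not part of inferCNV. The file layout (a header row of cell names, then one row per gene beginning with the gene name, whitespace-delimited, possibly quoted) is an assumption to check against your own output, and `read_expression_matrix`, `cna_signal`, and the `baseline` parameter are hypothetical names introduced here. With `baseline=0.0` the function computes the literal mean of squares; if your matrix is centered on a neutral value other than zero, you would probably want to subtract that value first.

```python
import io


def read_expression_matrix(handle):
    """Parse a whitespace-delimited genes-by-cells matrix.

    Assumed layout (not confirmed by the thread): a header row of cell
    names, then one row per gene starting with the gene name.
    Surrounding double quotes are stripped from names.
    """
    header = [tok.strip('"') for tok in handle.readline().split()]
    genes, rows = [], []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        genes.append(fields[0].strip('"'))
        rows.append([float(v) for v in fields[1:]])
    n_cells = len(rows[0]) if rows else 0
    # The header may or may not carry a label for the gene-name column,
    # so keep only the last n_cells entries.
    cells = header[-n_cells:] if n_cells else []
    return cells, genes, rows


def cna_signal(cells, rows, baseline=0.0):
    """Mean over genes of (value - baseline)^2, one score per cell."""
    n_genes = len(rows)
    return {
        cell: sum((row[j] - baseline) ** 2 for row in rows) / n_genes
        for j, cell in enumerate(cells)
    }


# Toy matrix standing in for a real file; the values are made up.
example = io.StringIO(
    '"cellA" "cellB"\n'
    '"GENE1" 1.10 1.00\n'
    '"GENE2" 0.90 1.00\n'
    '"GENE3" 1.20 1.00\n'
)
cells, genes, rows = read_expression_matrix(example)
scores = cna_signal(cells, rows, baseline=1.0)
for cell in cells:
    print(cell, round(scores[cell], 4))
```

For a real run you would replace the `io.StringIO` toy input with `open(...)` on the matrix file written next to the plot of interest.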

For identification of confident CNAs, you can run the HMM of infercnv that will define specific boundaries for CNAs and the specific fold change. A Bayesian network is also used for filtering based on posterior probabilities. You can find more details about the HMM on the wiki.

Regards,
Christophe.

@sunshine1126

@GeorgescuC Thanks for your reply, and I will try it.

@gloriafight

> @GeorgescuC Thanks for your reply, and I will try it.

Did you solve this? When I calculate the mean of squares for each cell from infercnv.observations.txt, I get the result shown below. However, the CNV scores are very low.
image

@Lualululu

> @GeorgescuC Thanks for your reply, and I will try it.
>
> Did you solve this? When I calculate the mean of squares for each cell from infercnv.observations.txt, I get the result shown below. However, the CNV scores are very low. image

I got the same results as you.
If I take the mean of the squares of the CNA values across the genome as the CNV signal, the results are in the 0.00X-0.00X range.
If I take the standard deviation of the CNA values across the genome instead, the results are in the 0.0X-0.0X range.
The standard-deviation version is closer to the values in the paper. Either way, tumor cells and non-malignant cells still separate fairly well, because tumor cells have both a higher CNV signal and a higher CNV correlation.

The paper doesn't specify how the CNV signal and CNV correlation thresholds were chosen, and that is what I am curious about: can fixed cutoffs separate most tumor cells from non-malignant cells? For your data, 0.003 for the CNV signal and 0.45 for the CNV correlation look like feasible thresholds.
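
As a rough Python sketch of the two signal variants and a two-threshold call: the function names are hypothetical, the 0.003 and 0.45 defaults are just the data-specific values suggested above, and the three-way rule (both high, both low, otherwise unresolved) is an assumption rather than something the paper or inferCNV prescribes. One thing the sketch makes visible is that, when a profile is centered on its own mean, the standard deviation is the square root of the mean of squares, which would be consistent with one metric landing near 0.00X and the other near 0.0X.

```python
from statistics import pstdev


def signal_mean_square(profile, baseline=0.0):
    """Mean of squared deviations from `baseline` (hypothetical parameter)."""
    return sum((v - baseline) ** 2 for v in profile) / len(profile)


def signal_std(profile):
    """Population standard deviation of the profile."""
    return pstdev(profile)


def classify_cell(signal, correlation, signal_cutoff=0.003, corr_cutoff=0.45):
    """Label a cell from its two scores.

    Assumed rule: both scores at or above their cutoffs -> malignant,
    both below -> non-malignant, anything else -> unresolved.
    """
    high_signal = signal >= signal_cutoff
    high_corr = correlation >= corr_cutoff
    if high_signal and high_corr:
        return "malignant"
    if not high_signal and not high_corr:
        return "non-malignant"
    return "unresolved"


# Made-up profile that is already centered on zero.
profile = [0.1, -0.1, 0.2, -0.2]
ms = signal_mean_square(profile)
sd = signal_std(profile)
print(ms, sd, sd ** 2)  # sd squared matches the mean of squares here

print(classify_cell(0.005, 0.60))
print(classify_cell(0.001, 0.20))
print(classify_cell(0.005, 0.20))
```

Cutoffs chosen on one dataset will not necessarily transfer to another, so it is worth plotting signal against correlation for known reference cells before fixing them.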

Not sure if my idea is reasonable, looking forward to your reply!

@pikapika505

Hi @pshukla99, hi @sunshine1126,

Did you find out how to compute the CNA signal and CNA correlation? I am trying to reproduce another cancer data analysis that also uses both.
Based on @GeorgescuC's answer, I guess I can compute the CNA signal as the mean of squares of expr.data for each cell across the genes. But what about the CNA correlation?

@gloriafight @Lualululu how did you calculate CNA correlation?

thank you,
Yulia
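
One way to read the definition quoted in the opening post ("the correlation between the CNA profile of each cell and the average CNA profile of all cells from the corresponding tumor") is sketched below in Python. This is an interpretation, not a confirmed method: using Pearson correlation is an assumption, `pearson` and `cna_correlation` are hypothetical helper names, and the average here includes every cell passed in (the cell itself and any non-malignant cells too), following the quoted wording literally.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def cna_correlation(profiles):
    """Correlate each cell's profile with the average profile of all cells.

    `profiles` maps cell name -> list of per-gene values, all in the same
    gene order and all from one tumor.
    """
    n_cells = len(profiles)
    n_genes = len(next(iter(profiles.values())))
    mean_profile = [
        sum(p[g] for p in profiles.values()) / n_cells for g in range(n_genes)
    ]
    return {cell: pearson(p, mean_profile) for cell, p in profiles.items()}


# Made-up values: two cells sharing a gain/loss pattern, one nearly flat cell.
profiles = {
    "tumor1": [1.2, 0.8, 1.2, 1.0],
    "tumor2": [1.3, 0.7, 1.1, 1.0],
    "normal": [1.0, 1.02, 0.98, 1.0],
}
corr = cna_correlation(profiles)
for cell, r in corr.items():
    print(cell, round(r, 3))
```

Because correlation is unaffected by adding a constant, the result does not depend on whether the values are centered on 0 or on some other baseline, unlike the mean-of-squares signal.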

6 participants