problem #496
Hi @Hananebh ,
I think the dendrogram you are seeing in the new version is due to each
cell being assigned its own subcluster by the Leiden clustering, but I
would need to know what options you used in both runs to be confident.
The older version of infercnv used a different library for the Leiden
clustering. That library is no longer maintained, because the
implementation has since been improved to work natively in R rather than
requiring Python, and to offer more options. One of the changes that came
with the new library is that the default scoring function went from
"modularity" to "CPM", which is theoretically an improvement, but the
optimal 'resolution' parameter changed with it. A good starting point is a
resolution of 0.05 with "CPM" where you used 1 with "modularity" (the only
mode available in 1.9), then tweak the number as needed. Alternatively, you
can switch the scoring function ("leiden_function" argument) back to the
old "modularity". There might still be differences because of the different
implementation, and because we now also run a PCA first ("leiden_method"
argument; the old behaviour was the "simple" option).
Regards,
Christophe.
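To see why a resolution that worked with "modularity" can leave every cell in its own subcluster with "CPM", here is a rough sketch in base R. It assumes the textbook form of the CPM quality function, where a cluster contributes its internal edge count minus the resolution times its number of possible node pairs; the exact normalization used inside infercnv and igraph may differ. The function name `cpm_contribution` and the toy numbers are made up for illustration.

```r
# Contribution of one candidate cluster to the CPM quality (assumed textbook form):
# internal edges minus resolution * number of possible node pairs.
cpm_contribution <- function(n_nodes, n_internal_edges, resolution) {
  n_internal_edges - resolution * choose(n_nodes, 2)
}

# Hypothetical cluster: 50 cells, each linked to about 20 neighbors inside it.
n_nodes <- 50
n_edges <- 50 * 20 / 2                      # 500 undirected edges
density <- n_edges / choose(n_nodes, 2)     # 500 / 1225, roughly 0.41

# Resolution 1: the cluster scores below 0, i.e. worse than 50 singletons.
gain_res_high <- cpm_contribution(n_nodes, n_edges, 1)

# Resolution 0.05: the same cluster now scores well above 0.
gain_res_low <- cpm_contribution(n_nodes, n_edges, 0.05)

print(c(density = density, res_1 = gain_res_high, res_0.05 = gain_res_low))
```

Under this form a cluster only pays off when its internal edge density exceeds the resolution, and a singleton always contributes 0, so on a sparse nearest-neighbor graph a resolution of 1 favors singletons.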
…On Fri, Jan 13, 2023, 04:21 Hananebh ***@***.***> wrote:
Hello,
I work with infercnv version 1.14 and I had a problem with the dendrogram
and the final result which is completely different from the one I obtained
with infercnv version 1.9
[image: Capture d’écran 2023-01-13 à 10 16 51]
<https://user-images.githubusercontent.com/55884781/212283332-dc7eeefd-96d5-4314-83de-b6cd7da61683.png>
I would like to know what difference there is between the two versions and
why I have a problem with the new dendrogram?
Your help is highly appreciated! Thank you.
|
Hi @GeorgescuC ,
|
Hi @Hananebh, I have the same issue with the latest version of inferCNV (1.14). I think you could try the following parameters to get something similar to how the clustering was previously done in 1.9:
infercnv::run(
    infercnv_obj,
    […]
    k_nn=30, # 1.9.1 default param
    leiden_resolution = 1, # 1.9.1 default param
    leiden_method = "simple", # 1.9.1 default param
    leiden_function = "modularity", # 1.9.1 default param
    […]
)
The results are not exactly the same, but at least they are rather similar. I initially tried to reduce the resolution parameter with the new default PCA+CPM approach, as suggested by @GeorgescuC, but I need to use reaaaaaally low values (down to 0.0005) to get at least some groups that are not singletons, and the original subgroup annotations are poorly clustered, so I don’t think that’s the way to go. Cheers, |
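When comparing resolution values, it can help to put a number on how fragmented a clustering is. The base R sketch below assumes the subcluster labels are already available as a plain character vector with one entry per cell; how to pull them out of an infercnv object is not covered here, and `singleton_fraction` is a hypothetical helper, not part of infercnv.

```r
# Fraction of clusters that contain exactly one cell.
singleton_fraction <- function(labels) {
  sizes <- table(labels)
  sum(sizes == 1) / length(sizes)
}

# Toy label vectors (made up): one fully fragmented, one with real groups.
labels_fragmented <- paste0("s", 1:6)
labels_grouped <- c("a", "a", "a", "b", "b", "c")

frac_fragmented <- singleton_fraction(labels_fragmented)
frac_grouped <- singleton_fraction(labels_grouped)
print(c(fragmented = frac_fragmented, grouped = frac_grouped))
```

A value close to 1 means almost every subcluster is a single cell, which is the situation described above.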
The settings that @nigiord posted should indeed give the closest results to previous versions of infercnv, which used the Python implementation of the Leiden algorithm instead of igraph's. One more setting added in 1.14 can affect the subclustering, and you might want to change it: the masking of genes that have a z score over (by default) 0.8 in references, controlled by the z_score_filter option. This masking is done to ignore genes that show strong signal in references, as is common for MHC genes on chromosome 6 for example. Looking at your plot, however, it might mask more genes than expected because of the cluster of cells at the top of the references that looks different from the rest and has a residual expression pattern very similar to your observations. Regards, |
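As a rough illustration of what such a mask does, here is a base R sketch. The z score definition used here (per-gene mean over reference cells, standardized across genes) is an assumption made for the example and may not match what infercnv computes internally; only the 0.8 threshold and the `z_score_filter` name come from the thread. The toy matrix is made up.

```r
# Toy residual expression for reference cells only (genes x cells).
ref_expr <- rbind(
  geneA = c(1.0, 1.0, 1.0, 1.0),
  geneB = c(1.1, 0.9, 1.0, 1.0),
  geneC = c(1.5, 1.4, 1.6, 1.5),  # strong signal shared by the references
  geneD = c(1.0, 1.0, 1.0, 1.0),
  geneE = c(0.9, 1.1, 1.0, 1.0)
)

# Assumed definition: standardize the per-gene reference means across genes.
gene_means <- rowMeans(ref_expr)
z_scores <- (gene_means - mean(gene_means)) / sd(gene_means)

# Genes past the threshold are left out of the neighbor search only.
z_score_filter <- 0.8
masked_genes <- names(z_scores)[abs(z_scores) > z_score_filter]
kept_genes <- setdiff(rownames(ref_expr), masked_genes)

print(masked_genes)
print(kept_genes)
```

In this toy case only the gene with a consistent signal across references passes the threshold; the other four would still be used when looking for nearest neighbors.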
hi @GeorgescuC Best, |
Hi @deevdevil88 , The z score filtering does not affect in any way the residual expression values shown in the figures. The z score only masks genes when calculating the nearest neighbors for cells (either directly on the residual expression, or on a PCA of the residual expression) at the start of the Leiden clustering, and nothing else. The downstream effect that can be visible is in the HMM predictions, since the HMM uses subclusters to combine information from clonal populations of cells, so inaccurate or overly fragmented subclusters would reduce HMM prediction accuracy. In the figure you shared, there are 2 things I notice:
I cannot tell from this if the cells you used as reference are actually healthy or not as infercnv is an analysis relative to what is provided as references, but it is worth looking more into the accuracy of the clusterings you defined for your references. Regards, |
hi @GeorgescuC , Best, |
Hi @deevdevil88 , One thing to keep in mind is that the normal baseline expressions are defined as the average expression in each reference. This means that having signal show up in your references does not necessarily mean that some of those cells are malignant/tumor. It simply means there is heterogeneity within the groups of cells you have defined as references. Conversely, if you define a homogeneous group of tumor cells as references (a clonal expansion for example), they will not display any signal since they are considered one of the baseline expression levels, and normal cells used as observations might show signal that is the opposite of the event that happened in the tumor, as the difference is relative. A very common example of signal showing up in healthy references is MHC genes on chromosome 6 for immune cells: usually about half the cells show a loss signal while the other half show a gain signal. This observation was also the basis for adding the z score masking option during subclustering. Regards, |
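The relative nature of the signal can be shown with a few made-up numbers for a single gene. This base R sketch assumes the residual is simply the expression minus the mean of one reference group (as it would be on a log scale); the real pipeline has more steps, and `residual_signal` is a hypothetical helper.

```r
# Assumption: residual = expression minus the mean of the reference cells.
residual_signal <- function(expr, reference_idx) {
  expr - mean(expr[reference_idx])
}

# Case 1: a heterogeneous reference group (e.g. an MHC-like gene).
# Half the cells end up below the baseline, half above it.
ref_hetero <- c(2, 2, 4, 4)
res_hetero <- residual_signal(ref_hetero, seq_along(ref_hetero))

# Case 2: clonal tumor cells carrying a gain are used as the reference.
# The tumor cells show no signal and the normal cells look like a loss.
expr_mixed <- c(tumor1 = 4, tumor2 = 4, normal1 = 2, normal2 = 2)
res_mixed <- residual_signal(expr_mixed, 1:2)

print(res_hetero)
print(res_mixed)
```

Neither case says anything about which cells are actually malignant; the residuals only describe how each cell differs from whatever was supplied as the baseline.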