Thank you for checking out the repo. It looks like you have successfully run phenopy on your input files, that's great! The behavior you describe is expected: it is a property of the HRSS semantic similarity scoring algorithm, which scales similarity scores to reward comparisons between nodes further down the ontology. As the algorithm is implemented here, even a phenotype compared to itself scores exactly 1.0 under HRSS only when beta_ic is 0.0, which is the case for leaf nodes. Does this explanation help?
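To make the leaf-node point concrete, here is a minimal sketch of how an HRSS-style score behaves for a term compared to itself. It assumes the commonly published form of HRSS, score = 1/(1 + gamma) * alpha_ic / (alpha_ic + beta_ic), and is not taken from phenopy's source. The function name `hrss_score` and the IC values are hypothetical and chosen only for illustration.

```python
def hrss_score(alpha_ic: float, beta_ic: float, gamma: float) -> float:
    """Hypothetical helper, assuming the published HRSS form.

    alpha_ic: information content of the most informative common ancestor
    beta_ic:  average IC gained going from the compared terms down to
              their most informative leaves (0.0 when both are leaves)
    gamma:    IC-based distance between the two terms (0.0 for identical terms)
    """
    if alpha_ic + beta_ic == 0:
        # Guard against division by zero, e.g. when comparing at the root.
        return 0.0
    return (1.0 / (1.0 + gamma)) * (alpha_ic / (alpha_ic + beta_ic))


# A leaf term compared to itself: gamma = 0 and beta_ic = 0.
leaf_self = hrss_score(alpha_ic=6.0, beta_ic=0.0, gamma=0.0)

# An internal term compared to itself: gamma = 0, but beta_ic > 0
# because more specific descendants exist below it.
internal_self = hrss_score(alpha_ic=3.0, beta_ic=1.5, gamma=0.0)

print(leaf_self, internal_self)
```

Under these assumptions only the leaf term reaches 1.0. The internal term stays below 1.0 even against itself, because beta_ic is nonzero.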
So how would I set a network-cutoff value, then, if identical terms might not score 1.0? Also, is it possible to supply my own scores if I have frequency values attached to the phenotypes?
Dear Kevin,
I would like to calculate the similarity for a few genes (~2000).
I annotated these genes with the HPO codes from the human phenotype ontology webpage (http://compbio.charite.de/jenkins/job/hpo.annotations/lastSuccessfulBuild/artifact/util/annotation/genes_to_phenotype.txt).
I reshaped it and got a file like this:
which I think is the correct format for phenopy. I then used the command:
and I got as output something like this:
However, the self-similarity of some genes is not 1, as I was expecting. For instance:
Would you expect something like this? How would you explain it?
Should I use a different --summarization-method?
Best regards,
Luca