P-value and score #184

Open
FabiKeiki opened this issue Mar 3, 2023 · 1 comment
@FabiKeiki

I'm intrigued by the choice to mix p-values and scores in the per-contig phage prediction. Scores are a measure of some kind, whereas p-values describe the statistical significance of a score. Also, the p-values shown in the final report are not p-values but 1 - p-value. As it stands, we can have a high score with a high p-value (low statistical significance) and a low score with a low p-value (high statistical significance). According to the benchmarking of the various tools (HO et al.), I can trust a score of 0.8 (very few false positives); I wouldn't, however, trust a p-value of 0.2 (0.8 in the final report).
So isn't it dangerous to mix these values, given that they do not describe the same thing? I think virfinder and deepvirfinder also give you a score, so maybe report the score instead and eliminate all values that are not statistically significant.
I think that, the way it is now, the normed sum of the phage tools is not really usable.
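To make the suggestion concrete, here is a minimal sketch of what "report the score and drop non-significant values" could look like. Everything in it is an assumption for illustration: the `Prediction` record, the example contigs and numbers, and the 0.05 cutoff are hypothetical and are not taken from the pipeline's code or output.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    contig: str
    score: float    # tool score; higher means more phage-like
    p_value: float  # raw p-value; lower means more significant


def reported_value(p_value: float) -> float:
    """The '1 - p-value' form described above for the final report."""
    return 1.0 - p_value


def keep_significant(predictions, alpha=0.05):
    """Keep only predictions whose raw p-value is at or below alpha.

    alpha=0.05 is a conventional cutoff chosen here as an assumption.
    """
    return [p for p in predictions if p.p_value <= alpha]


# Hypothetical values illustrating the cases discussed above.
predictions = [
    Prediction("contig_1", score=0.95, p_value=0.01),  # high score, significant
    Prediction("contig_2", score=0.90, p_value=0.20),  # high score, not significant
    Prediction("contig_3", score=0.30, p_value=0.01),  # low score, significant
]

kept = keep_significant(predictions)
kept_names = [p.contig for p in kept]
```

With these made-up numbers, `contig_2` is dropped even though its score is high, and its raw p-value of 0.2 would have appeared as roughly 0.8 in the `1 - p-value` form, which is exactly the value that can be mistaken for a trustworthy score.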

@mult1fractal mult1fractal self-assigned this Mar 3, 2023
@mult1fractal
Collaborator

Hey
My initial thought was to collect the information (scores and p-values) in an overview table. As the articles state, scores and p-values tell the user how likely it is that the contig is a phage. Therefore I put these values in the overview table and explained in tab. 2 what they are. sum_normed is a value I used to sort the table by the most likely phages (the value itself has no meaning). I agree that I need to rename it or also explain it in tab. 2. In tab. 2, I changed the virfinder and deepvirfinder entries to "score" where they were named wrongly. The actual numbers in the overview table are the scores, not the p-values, from both tools. I also double-checked this in the code. Thank you for the hint!
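For readers wondering how a sort-only value like sum_normed can behave, here is a rough sketch. The formula is an assumption (the sum of the available per-tool values divided by the number of tools); the actual calculation is not given in this thread, and the tool name `tool_c` and all numbers are hypothetical.

```python
def sum_normed(tool_values):
    """Hypothetical ranking key: sum of available values / number of tools.

    Tools without a value for the contig (None) contribute nothing to the sum.
    """
    if not tool_values:
        return 0.0
    present = [v for v in tool_values.values() if v is not None]
    return sum(present) / len(tool_values)


# Hypothetical overview rows: contig -> per-tool value.
rows = {
    "contig_1": {"virfinder": 0.9, "deepvirfinder": 0.8, "tool_c": None},
    "contig_2": {"virfinder": 0.5, "deepvirfinder": 0.5, "tool_c": 0.5},
    "contig_3": {"virfinder": 0.2, "deepvirfinder": None, "tool_c": None},
}

# Sort contigs from highest to lowest ranking key.
ranked = sorted(rows, key=lambda contig: sum_normed(rows[contig]), reverse=True)
```

A key like this only orders the rows. It is not a probability or a significance level, which is why renaming it or explaining it in tab. 2 makes sense.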
