Algorithm always returns all 1. #1

Open
BoPeng opened this issue Jun 13, 2020 · 3 comments

BoPeng commented Jun 13, 2020

I seem to be missing something obvious here: whatever I try, the fitness vector I get is either all NAs or all 1s.

First, could you let me know how you obtained the data/dataHou18_map0.gv file? I tried the command with the parameters from the SCITE website and got graphs that are vastly different from yours. More importantly, whereas you can specify m=18, I had to specify m=17 to avoid errors.

I also tried other datasets, for example a graph that I built to reproduce the tree from the CRC2 tumor in Leung et al., Genome Research 2017, but again I get all 1s. I have attached the data matrix, gene names, and the gv file that I have so that you can reproduce the problem.

Many thanks in advance.

CO8.genename.txt
CO8.master_matrix.scite.txt
CO8.master_matrix_ml0.gv.txt

@vtsyvina
Collaborator

Hello, @BoPeng .

We used infSCITE to obtain the mutation trees because it can add repeated mutations. That is why we have the nRep parameter and why the tree contains 19 mutations instead of 18. I see that this may be confusing, so I changed the README file and the file names to make them clearer. I also added one more example without a repeated mutation, along with the commands we used to get those gv files.

As for your example with the CO8 patient, there was a problem in our code: when a mutation has no attached cells, we have no estimate of its frequency. I fixed this by assigning a small frequency to such mutations. I ran the following command for it:

matlab -nodisplay -nodesktop -r "gv_file='data/CO8.master_matrix_ml0.gv.txt';names_file='data/CO8.genename.txt';n=86;m=26;SCIFIL"

Note that I don't use the nRep parameter here, assuming that each mutation appears only once.

I got the following result:
1.0000 1.0573 1.1545 1.0441 1.0345 1.0234 1.0279 1.0188 1.0062 1.0100 1.0023 1.0143 1.0056 1.0130 1.0216 1.0025 1.0843 1.0158 1.0461 1.0709 1.0314 1.0603 1.0333 1.0491 1.0438 1.0644 1.1013

Let me know if this solves the issue.
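
For readers following along, the fix described above (giving a small frequency to mutations with no attached cells) can be sketched roughly as follows. This is a Python illustration, not SCIFIL's MATLAB code: the function name, the `floor` value, and the assumption that a mutation's frequency is the fraction of cells attached to it are all hypothetical.

```python
def mutation_frequencies(cells_per_mutation, n_cells, floor=1e-3):
    """Estimate one frequency per mutation from attached-cell counts.

    Assumption: frequency = attached cells / total cells. A mutation with
    no attached cells would get frequency 0, so it receives a small
    positive value instead (`floor` is a hypothetical default).
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    freqs = []
    for count in cells_per_mutation:
        f = count / n_cells
        freqs.append(f if f > 0 else floor)
    return freqs


if __name__ == "__main__":
    # Three mutations; the second one has no attached cells.
    print(mutation_frequencies([43, 0, 12], n_cells=86))
```

The point of the floor is only to keep every frequency strictly positive so that later steps are not fed a zero; the value actually used in SCIFIL is not stated in this thread.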

@BoPeng
Author

BoPeng commented Jun 15, 2020

Thanks for your quick response. Your command works for the files I uploaded, but I get an error on another file. More specifically, I

  1. Used infSCITE to process the same data with command
infSCITE -i CO8.master_matrix.scite  -n 26 -m 86 -r 1 -l 500000 -fd .1 -ad .3 -e .1 -a -seed 225
  2. Ran SCIFIL on the attached .gv file and got
matlab -nodisplay -nodesktop -r "gv_file='data/CO8.master_matrix_ml0.gv';n=86;m=26;SCIFIL"
Starting SCIFIL
Index in position 1 is invalid. Array indices must be positive integers or logical values.

Error in scite2Tree (line 12)
    AM(u,v) = 1;

Error in SCIFIL (line 52)
AMscite = scite2Tree(gv_file,n,m); % gv_file = 'data/dataHou18_map0_rep3.gv'

Can you tell me what is wrong with the input or the options used?
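
The message says that AM(u,v) was indexed with something that is not a positive integer, so one quick check is to look at the smallest integer node id in the gv file. A rough Python sketch of that check follows. It assumes that edges are written as `u -> v;` with bare integers for mutation nodes, which may not match the real file exactly, and the function names are made up for illustration.

```python
import re

# Assumption: one edge per line, written as "u -> v;" with unquoted node names.
EDGE = re.compile(r"^\s*(\w+)\s*->\s*(\w+)")


def integer_node_ids(gv_text):
    """Collect every node name on an edge line that is a plain integer."""
    ids = set()
    for line in gv_text.splitlines():
        m = EDGE.match(line)
        if not m:
            continue
        for token in m.groups():
            if re.fullmatch(r"[0-9]+", token):
                ids.add(int(token))
    return ids


def is_one_based(gv_text):
    """True if there are integer node ids and the smallest one is >= 1."""
    ids = integer_node_ids(gv_text)
    return bool(ids) and min(ids) >= 1


if __name__ == "__main__":
    sample = "digraph G {\n0 -> 1;\n0 -> 2;\n2 -> s_1;\n}"
    print(sorted(integer_node_ids(sample)), is_one_based(sample))
```

If the smallest id is 0, any code that uses node ids directly as 1-based matrix indices would fail in the way shown above.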

CO8.master_matrix_ml0.gv.infscite.txt

@vtsyvina
Collaborator

Sorry for the late response.
Yes, I totally forgot about this: we started out using SCITE and then moved to infSCITE, and the two have different output formats (infSCITE outputs labels and starts nodes from 0, not 1, and cells are named s_1, not s1 as in SCITE). The easiest fix is to rebuild your copy of infSCITE with the modified cpp file that we use, so that its output matches SCITE's and our script works fine (I ran it myself).

I know that requiring one specific format is not a good solution. I'll work on fixing it so that SCIFIL can work with any gv file.
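
Until then, someone who cannot rebuild infSCITE could convert an existing gv file instead. The Python sketch below is not the modified output.cpp and not part of SCIFIL. It only illustrates the two differences named above, under the assumptions that mutation nodes are bare 0-based integers, that cell nodes are named `s_<k>`, and that each edge sits on its own line as `u -> v;`. The function names are hypothetical, and cell numbering is left untouched because the thread does not say whether it differs.

```python
import re

# Assumption: one edge per line; anything after the second node is kept as-is.
EDGE = re.compile(r"^(\s*)(\w+)(\s*->\s*)(\w+)(.*)$")


def _fix_node(token):
    """Shift an integer node id by one and rename s_<k> to s<k>."""
    if re.fullmatch(r"[0-9]+", token):
        return str(int(token) + 1)
    cell = re.fullmatch(r"s_([0-9]+)", token)
    if cell:
        return "s" + cell.group(1)
    return token


def to_scite_style(gv_text):
    """Rewrite node names on edge lines; all other lines pass through."""
    out = []
    for line in gv_text.splitlines():
        m = EDGE.match(line)
        if m:
            lead, u, arrow, v, rest = m.groups()
            line = lead + _fix_node(u) + arrow + _fix_node(v) + rest
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(to_scite_style("digraph G {\n0 -> 1;\n1 -> s_1;\n}"))
```

The shift should be applied only to a file whose node ids really start at 0; running it on a file that is already 1-based would move every id by one.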

output.cpp.txt
