Issue running MANTIS #42

Open
chloe35430 opened this issue Mar 28, 2019 · 12 comments
@chloe35430

Hello,

I have an issue running MANTIS on exome datasets.
I ran the following command:

python2.7 mantis.py -b ../../Test_MANTIS/genome_scan.bed --genome ../../Test_MSISensor/IndexedGenome/hg19.fa -n ../../Test_MANTIS/constit.cleaned.bam -t ../../Test_MANTIS/somatic.cleaned.bam -o ../../Test_MANTIS/resultats_mantis_hg19.txt -mrq 20.0 -mlq 25.0 -mlc 20 -mrr 1

This was printed to my terminal:

Microsatellite Analysis for Normal-Tumor InStability (v1.0.4)
python /Users/chloe/work/MANTIS/kmer_repeat_counter.py
-b /Users/chloe/Test_MANTIS/genome_scan.bed
-n /Users/chloe/Test_MANTIS/albert_constit.cleaned.bam
-t /Users/chloe/Test_MANTIS/albert_somatic.cleaned.bam
-o /Users/chloe/Test_MANTIS/resultats_mantis_albert_hg19.kmer_counts.txt
--min-read-quality 20.0
--min-locus-quality 25.0
--min-read-length 35
--genome /Users/chloe/Test_MSISensor/IndexedGenome/hg19.fa
--threads 1
Getting repeat counts for repeat units (k-mers) ...

Then, after running for 48 hours, I got this error, which I don't understand:

Traceback (most recent call last):
  File "/Users/chloe/work/MANTIS/kmer_repeat_counter.py", line 866, in <module>
    normal = krc.process(config['normal_filepath'], msi_loci, config)
  File "/Users/chloe/work/MANTIS/kmer_repeat_counter.py", line 614, in process
    self.status_check(queue_out.qsize())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 143, in qsize
    return self._maxsize - self._sem._semlock._get_value()
NotImplementedError

I don't understand where this error comes from. Could you please help me?

Thanks,
Chloé

@rbonneville
Contributor

Hello @chloe35430, could you please share your genome BED file?

@rbonneville rbonneville self-assigned this Mar 28, 2019
@chloe35430
Author

chloe35430 commented Mar 29, 2019

Hello @rbonneville, thanks for your quick answer!

It is a whole-genome BED file. It is 1.15 GB, and the .bed file type is not supported as an attachment here.

Here are the first lines of the file:
chr1 10485 10498 (GCCC)3 0 +
chr1 10629 10635 (GC)3 0 +
chr1 10652 10658 (AG)3 0 +
chr1 10658 10664 (GC)3 0 +
chr1 10681 10687 (AG)3 0 +
chr1 10687 10693 (GC)3 0 +
chr1 10710 10716 (AG)3 0 +
chr1 10716 10722 (GC)3 0 +
chr1 10739 10745 (AG)3 0 +
chr1 10745 10751 (GC)3 0 +

I used RepeatFinder to create it, and it seemed to run without any issues.
Let me know if you need anything else.

Thanks for the help!

Chloé

@rbonneville
Contributor

@chloe35430, MANTIS is crashing because of this extremely large number of microsatellite loci. RepeatFinder generates a .bed file of all putative microsatellites in the supplied genome. This should be filtered to only use microsatellites within your capture region or other region of interest.

@rbonneville rbonneville removed the bug label Apr 4, 2019
@chloe35430
Author

@rbonneville, sorry for the late answer.
I filtered the file generated by RepeatFinder to keep only the loci sequenced in the experiment. My BED file is now 25 KB, but I still get the same NotImplementedError, just much sooner than before.

Do you have any idea how to solve this?

Thanks for your help,

@audyavar

audyavar commented Apr 18, 2019

For exome-seq data, how do you go about filtering the BED file generated by RepeatFinder? I have paired-end exome-seq data from primary tumors. Do you need to filter the loci for exome-seq data?

@rbonneville
Contributor

@chloe35430 how many lines are in your BED file?

@rbonneville
Contributor

@audyavar The BED file from RepeatFinder should be filtered for loci included in your capture region. This can be done easily with bedtools, for example:

bedtools intersect -a microsatellites.bed -b regions.bed -wa > microsatellites_filtered.bed
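
For readers without bedtools at hand, the same idea can be sketched in plain Python. This is only an illustration of interval-overlap filtering, not part of MANTIS or bedtools: `read_bed` and `filter_loci` are hypothetical helper names, and the region coordinates below are made up. Unlike `bedtools intersect -wa`, which reports a locus once per overlapping region, this sketch keeps each overlapping locus once.

```python
def read_bed(lines):
    """Parse BED lines into (chrom, start, end, original_line) tuples."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the header lines some BED files carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        records.append((fields[0], int(fields[1]), int(fields[2]), line))
    return records


def filter_loci(loci_lines, region_lines):
    """Keep each locus line that overlaps at least one region by >= 1 bp."""
    regions = {}
    for chrom, start, end, _ in read_bed(region_lines):
        regions.setdefault(chrom, []).append((start, end))
    kept = []
    for chrom, start, end, line in read_bed(loci_lines):
        # BED intervals are half-open, so two intervals overlap when
        # each one starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in regions.get(chrom, [])):
            kept.append(line)
    return kept


# Hypothetical example data: loci in the RepeatFinder style shown earlier
# in this thread, plus two invented capture regions.
example_loci = [
    "chr1\t10485\t10498\t(GCCC)3\t0\t+",
    "chr1\t10629\t10635\t(GC)3\t0\t+",
    "chr2\t100\t106\t(AG)3\t0\t+",
]
example_regions = [
    "chr1\t10490\t10500",
    "chr1\t10635\t10700",
]

for kept_line in filter_loci(example_loci, example_regions):
    print(kept_line)
```

The linear scan over regions is fine for a sketch but slow for large inputs; for real data, bedtools remains the practical choice.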

@audyavar

audyavar commented Apr 18, 2019 via email

@chloe35430
Author

@rbonneville My BED file has 828 lines. Is that still too many?

@rbonneville
Contributor

@chloe35430 That is definitely not too many lines. I believe you are encountering a Mac-specific issue with Python multithreading, in particular because MANTIS uses semaphores which are not properly implemented in OS X.

In kmer_repeat_counter.py, try replacing BoundedSemaphore with Semaphore. This will break thread safety, but since you're only using one thread anyway it should be less of an issue.

@rbonneville
Contributor

@audyavar Your whole exome capture kit should have come with a list of target regions, likely BED formatted. It is necessary to filter for the exome because RepeatFinder outputs putative microsatellites throughout the entire genome.

@chloe35430
Author

@rbonneville I just tried the modification you suggested. It still doesn't work and produces the same error. Do you have any other ideas for solving this?
I am going to try running it on a Linux system to see whether I get the same issue, but I need to get it to work on Mac too...
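
One possible explanation for why the Semaphore swap changes nothing: the traceback at the top of this thread fails inside `multiprocessing.Queue.qsize()` itself, and the Python documentation notes that `qsize()` may raise `NotImplementedError` on Unix platforms such as Mac OS X where `sem_getvalue()` is not implemented. A minimal sketch of a workaround follows, assuming the queue size is only used for status reporting (an assumption, not verified against the MANTIS source). `safe_qsize` and `MacLikeQueue` are hypothetical names introduced for illustration; `MacLikeQueue` is a stand-in that mimics the failing behavior so the example behaves the same on any platform.

```python
def safe_qsize(queue, default=-1):
    """Return queue.qsize(), or `default` where the platform cannot report it."""
    try:
        return queue.qsize()
    except NotImplementedError:
        # multiprocessing.Queue.qsize() raises this on platforms
        # without sem_getvalue(), such as macOS.
        return default


class MacLikeQueue(object):
    """Stand-in queue whose qsize() fails the way the traceback shows."""

    def qsize(self):
        raise NotImplementedError


size = safe_qsize(MacLikeQueue())
if size < 0:
    print("queue size unavailable on this platform")
else:
    print("queue size: %d" % size)
```

Under that assumption, the call at line 614 in the traceback, `self.status_check(queue_out.qsize())`, would be the place to try such a wrapper. Whether `status_check` tolerates a placeholder value would need to be checked in the MANTIS code.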
