.paf file for chimera checking #381

Closed
a-89 opened this issue Apr 13, 2019 · 3 comments

a-89 commented Apr 13, 2019

Dear Heng,

I am using minimap2 with full-length 16S rRNA ONT reads from a mock community. As noted in other issues, running with or without -c changes my .paf results. I am using this .paf file to remove chimeras (with yacrd).

I just wanted to be sure that the -c option gives more accurate results, since many more reads appear chimeric without -c than with it. When I split these "chimeric" reads and BLAST the pieces, they do seem chimeric, but maybe that is due to the high similarity of the 16S rRNA gene (I attached two sequences as an example: chimera_or_not.txt).

Minimap2 output:
## Without the -c option, the read 3984e626-4c77-4461-99d2-dcac7d389900 is considered chimeric

3984e626-4c77-4461-99d2-dcac7d389900 1407 654 1401 - Lactobacillus_fermentum_complete_genome 1905333 1889994 1890758 434 767 0 tp:A:P cm:i:51 s1:i:432 s2:i:432 dv:f:0.0639          
3984e626-4c77-4461-99d2-dcac7d389900 1407 654 1401 + Lactobacillus_fermentum_complete_genome 1905333 1229551 1230315 434 767 0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639            
3984e626-4c77-4461-99d2-dcac7d389900 1407 654 1401 - Lactobacillus_fermentum_complete_genome 1905333 20179 20943 434 767 0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639            
3984e626-4c77-4461-99d2-dcac7d389900 1407 654 1401 - Lactobacillus_fermentum_complete_genome 1905333 411724 412488 434 767 0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639            
3984e626-4c77-4461-99d2-dcac7d389900 1407 654 1401 - Lactobacillus_fermentum_complete_genome 1905333 97074 97838 434 767 0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639            
3984e626-4c77-4461-99d2-dcac7d389900 1407 20 466 - Staphylococcus_aureus_chromosome 2718780 728964 729429 158 466 11 tp:A:P cm:i:16 s1:i:155 s2:i:141 dv:f:0.0935          
3984e626-4c77-4461-99d2-dcac7d389900 1407 20 466 + Staphylococcus_aureus_chromosome 2718780 2091147 2091612 144 466 0 tp:A:S cm:i:14 s1:i:141 dv:f:0.1024            

## However, with the -c option this read is considered okay

3984e626-4c77-4461-99d2-dcac7d389900 1407 5 1404 + Lactobacillus_fermentum_complete_genome 1905333 1228882 1230318 1279 1463 0 NM:i:184 ms:i:1768 AS:i:1768 nn:i:0 tp:A:P cm:i:51 s1:i:432 s2:i:432 dv:f:0.0639 cg:Z:40M2D5M1I27M2D36M4D15M1D28M1D44M1D4M1D9M1I10M3I3M1D38M1D31M1D3M1I3M1D11M1D13M2D3M2I75M1I2M2D6M2D2M5D14M2I6M1D3M1I34M2D30M1I54M1D4M1D150M1I45M1D62M2I3M1D2M2D42M4D22M1D1M1D59M1I13M2D33M2D9M2D54M1I6M2D14M1D15M1I7M1I57M1D24M1I8M1D13M2D4M1D4M2I15M2D1M1I6M2I40M1D5M3D17M1I55M1D33M
3984e626-4c77-4461-99d2-dcac7d389900 1407 5 1404 - Lactobacillus_fermentum_complete_genome 1905333 20176 21612 1279 1466 0 NM:i:187 ms:i:1768 AS:i:1768 nn:i:0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639 cg:Z:32M1D55M1I16M3D3M1D43M5I3M2D3M2D11M2I7M1D3M2D14M1D8M1I22M1D57M1I9M1I16M1D14M2D3M1I56M2D10M2D30M2D14M1I60M1D2M1D19M4D44M2D2M3I3M2D62M1D43M1I151M1D5M1D51M1I33M2D33M1I4M1D5M2I14M5D3M2D5M2D3M1I75M2I3M2D12M1D12M1D2M1I3M1D29M1D41M1D3M3I8M1I11M1D2M1D46M1D27M1D15M4D37M2D25M1I3M2D44M
3984e626-4c77-4461-99d2-dcac7d389900 1407 5 1404 - Lactobacillus_fermentum_complete_genome 1905333 97071 98507 1279 1466 0 NM:i:187 ms:i:1768 AS:i:1768 nn:i:0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639 cg:Z:32M1D55M1I16M3D3M1D43M5I3M2D3M2D11M2I7M1D3M2D14M1D8M1I22M1D57M1I9M1I16M1D14M2D3M1I56M2D10M2D30M2D14M1I60M1D2M1D19M4D44M2D2M3I3M2D62M1D43M1I151M1D5M1D51M1I33M2D33M1I4M1D5M2I14M5D3M2D5M2D3M1I75M2I3M2D12M1D12M1D2M1I3M1D29M1D41M1D3M3I8M1I11M1D2M1D46M1D27M1D15M4D37M2D25M1I3M2D44M
3984e626-4c77-4461-99d2-dcac7d389900 1407 5 1404 - Lactobacillus_fermentum_complete_genome 1905333 1889991 1891427 1278 1466 0 NM:i:188 ms:i:1762 AS:i:1762 nn:i:0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639 cg:Z:31M1D56M1I16M3D3M1D43M5I3M2D3M2D11M2I7M1D3M2D14M1D8M1I22M1D57M1I9M1I16M1D14M2D3M1I56M2D10M2D30M2D14M1I60M1D2M1D19M4D44M2D2M3I3M2D62M1D43M1I151M1D5M1D51M1I33M2D33M1I4M1D5M2I14M5D3M2D5M2D3M1I75M2I3M2D12M1D12M1D2M1I3M1D29M1D41M1D3M3I8M1I11M1D2M1D46M1D27M1D15M4D37M2D25M1I3M2D44M
3984e626-4c77-4461-99d2-dcac7d389900 1407 5 1404 - Lactobacillus_fermentum_complete_genome 1905333 411721 413167 1281 1476 0 NM:i:195 ms:i:1752 AS:i:1752 nn:i:0 tp:A:S cm:i:51 s1:i:432 dv:f:0.0639 cg:Z:32M1D55M1I16M3D3M1D43M5I3M2D3M2D11M2I7M1D3M2D14M1D8M1I22M1D57M1I9M1I16M1D14M2D3M1I56M2D10M2D30M2D14M1I60M1D2M1D19M4D44M2D2M3I3M2D62M1D43M1I151M1D5M1D51M1I33M2D33M1I4M1D5M2I14M5D3M2D5M2D3M1I75M2I3M2D12M1D12M1D2M1I3M1D29M1D41M1D3M3I8M1I11M1D2M1D46M1D27M1D15M4D6M4D3M6D28M2D25M1I3M2D44M
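
To make the difference between the two outputs concrete, here is a minimal Python sketch that flags a read as a chimera candidate when two of its primary (tp:A:P) alignments hit different targets on essentially non-overlapping parts of the read. This is only an illustration of why the outputs above lead to different calls; it is not how yacrd works. The function names and the `max_overlap` threshold are assumptions made for this example. The sample lines are copied from the output above, with the cg tag left out of the -c line for brevity.

```python
from collections import defaultdict

READ = "3984e626-4c77-4461-99d2-dcac7d389900"
LACTO = "Lactobacillus_fermentum_complete_genome"
STAPH = "Staphylococcus_aureus_chromosome"

# Lines taken from the issue (whitespace-separated; real PAF is tab-separated).
WITHOUT_C = [
    f"{READ} 1407 654 1401 - {LACTO} 1905333 1889994 1890758 434 767 0 "
    "tp:A:P cm:i:51 s1:i:432 s2:i:432 dv:f:0.0639",
    f"{READ} 1407 20 466 - {STAPH} 2718780 728964 729429 158 466 11 "
    "tp:A:P cm:i:16 s1:i:155 s2:i:141 dv:f:0.0935",
    f"{READ} 1407 20 466 + {STAPH} 2718780 2091147 2091612 144 466 0 "
    "tp:A:S cm:i:14 s1:i:141 dv:f:0.1024",
]
WITH_C = [
    f"{READ} 1407 5 1404 + {LACTO} 1905333 1228882 1230318 1279 1463 0 "
    "NM:i:184 ms:i:1768 AS:i:1768 nn:i:0 tp:A:P cm:i:51 s1:i:432 "
    "s2:i:432 dv:f:0.0639",
]


def parse_paf_line(line):
    """Parse the mandatory PAF columns we need, plus the tp tag if present."""
    f = line.split()  # split() accepts both tabs and spaces
    tags = {t[:2]: t[5:] for t in f[12:] if len(t) > 5 and t[2] == ":"}
    return {
        "qname": f[0], "qstart": int(f[2]), "qend": int(f[3]),
        "tname": f[5], "mapq": int(f[11]), "tp": tags.get("tp"),
    }


def candidate_chimeras(paf_lines, max_overlap=50):
    """Return {read: (left, right)} for reads whose adjacent primary
    alignments map to different targets and overlap on the read by at
    most max_overlap bases (hypothetical threshold)."""
    primaries = defaultdict(list)
    for line in paf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        rec = parse_paf_line(line)
        if rec["tp"] == "P":
            primaries[rec["qname"]].append(rec)

    flagged = {}
    for qname, recs in primaries.items():
        recs.sort(key=lambda r: r["qstart"])
        for left, right in zip(recs, recs[1:]):
            overlap = left["qend"] - right["qstart"]
            if left["tname"] != right["tname"] and overlap <= max_overlap:
                flagged[qname] = (left, right)
                break
    return flagged


print("without -c:", sorted(candidate_chimeras(WITHOUT_C)))
print("with -c:   ", sorted(candidate_chimeras(WITH_C)))
```

On these lines the read is flagged without -c, because its two primary hits (Staphylococcus on 20-466, Lactobacillus on 654-1401) cover different parts of the read, and it is not flagged with -c, where a single primary alignment spans 5-1404.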

lh3 (Owner) commented Apr 13, 2019

-c is better.

lh3 closed this as completed Apr 13, 2019

a-89 (Author) commented Apr 15, 2019

Thanks Heng! Sorry for the duplicate issue!

I was wondering if changing some other parameters would allow identifying more chimeric reads. I am working with 1,500 bp reads that are highly similar (16S rRNA), and through BLAST I have seen more chimeric reads than the ones reported with the default settings...

Would it make sense to lower the -g parameter? Any other suggestions?

Thanks again!

lh3 (Owner) commented Apr 15, 2019

Reduce -z.
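
As a rough sketch of how the two suggestions in this thread (-c and a reduced -z) could be combined in one run, the Python helper below only assembles a minimap2 command line; it does not execute it. The file names, the `map-ont` preset, and the value 200 are assumptions for illustration and were not given in the thread. In the minimap2 manual, -z is the Z-drop score (the manual lists 400 as the default for its first value), so a lower value should make minimap2 truncate an alignment more readily when the score drops, which may leave more split alignments for a chimera detector to pick up.

```python
import shlex


def minimap2_paf_command(reference, reads, z_drop=None, preset="map-ont"):
    """Build (but do not run) a minimap2 command that writes PAF with
    base-level alignment (-c) and, optionally, a custom Z-drop (-z)."""
    cmd = ["minimap2", "-x", preset, "-c"]
    if z_drop is not None:
        cmd += ["-z", str(z_drop)]  # illustrative value, tune on your data
    cmd += [reference, reads]
    return cmd


# Hypothetical file names; redirect stdout to a .paf file when running it.
cmd = minimap2_paf_command("mock_refs.fasta", "reads.fastq", z_drop=200)
print(shlex.join(cmd))
```

Comparing the number of reads flagged at a few different -z values against the BLAST-based checks would be one way to judge whether the lower setting is helping or just fragmenting good alignments.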
