Enhancement request: when I run STAR with --outReadsUnmapped Fastx on Illumina reads, STAR does not just append the mapping status of the read and its mate to the header. It also rewrites the final field of the original Illumina header in a way that removes the index sequence. It would be very nice to retain the index sequence in the headers of the unmapped reads.
Example of a read in the raw data and in the Unmapped file:
raw data:
@NS500540:129:HKJG2BGX7:4:13402:14458:19861 1:N:0:CGTAAG
unmapped file:
@NS500540:129:HKJG2BGX7:4:13402:14458:19861 0:N: 00
I have only tested this with STAR 2.7.9a, but I didn't see anything about this issue in the changelogs of subsequent releases or in the issue tracker.
Why this would be useful: I ran many samples through STAR at the same time, and they were distinguishable only by the index sequence. If the index sequence were retained, I could easily identify which sample each unmapped read came from; right now I can only determine which sequencing run it came from, based on the naming conventions in Illumina FASTQ headers.
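Until something like this is supported, one possible workaround is to look the unmapped read IDs up in the raw FASTQ and recover the index sequence from there. The following is only a minimal sketch, not anything STAR provides: the function names are hypothetical, it assumes plain 4-line FASTQ records, it assumes the index is the last colon-separated field of the header comment (as in the example above), and it holds every raw read ID in memory. The sequence and quality lines in the demo are made-up placeholders.

```python
import io


def read_headers(handle):
    """Yield (read_id, comment) for each record of a 4-line FASTQ stream."""
    for line_no, line in enumerate(handle):
        if line_no % 4 == 0:  # first line of every record is the header
            # Drop the leading '@', then split at the first space.
            read_id, _, comment = line.rstrip("\n")[1:].partition(" ")
            yield read_id, comment


def index_by_read_id(raw_handle):
    """Map read ID -> index sequence taken from the raw FASTQ headers.

    Assumption: the index is the last colon-separated field of the comment.
    """
    return {
        read_id: comment.rsplit(":", 1)[-1]
        for read_id, comment in read_headers(raw_handle)
    }


def restore_index(unmapped_handle, index_map):
    """Yield (read_id, index) for each unmapped read; index is None if unknown."""
    for read_id, _ in read_headers(unmapped_handle):
        yield read_id, index_map.get(read_id)


# Demo using the two headers from this issue (placeholder sequence/quality).
raw = io.StringIO(
    "@NS500540:129:HKJG2BGX7:4:13402:14458:19861 1:N:0:CGTAAG\nACGT\n+\nIIII\n"
)
unmapped = io.StringIO(
    "@NS500540:129:HKJG2BGX7:4:13402:14458:19861 0:N: 00\nACGT\n+\nIIII\n"
)
for read_id, index in restore_index(unmapped, index_by_read_id(raw)):
    print(read_id, index)
```

For real data the in-memory dictionary could become large, so this approach is only practical when the raw FASTQ is of moderate size or can be processed per lane.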