ConvperPos isn't working #4
Comments
Thanks for getting in touch. |
10 lines from annotated_sorted_bam file |
You do not have an ST tag in the bam file that you sent. If the gene IDs in the gtf do not match the gene IDs in the strandedness.csv file, this would be a problem. Can you confirm that these are the same gene IDs? |
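One way to confirm whether the two sets of gene IDs agree is to list the IDs that appear in only one of the files. The following is a minimal sketch, not part of NASC-seq: it assumes the strandedness file is comma-separated with a header row and the gene ID in the first column (the actual layout is not shown in this thread), and the toy IDs and the `gene,strand` header are made up for illustration.

```python
import csv
import re

# gene_id "..." is the standard GTF attribute for the gene identifier.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gtf_gene_ids(gtf_lines):
    """Collect gene_id values from the attribute column (9th field) of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip GTF header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID_RE.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def csv_gene_ids(csv_lines, id_column=0, has_header=True):
    """Collect gene IDs from one column of a comma-separated file (assumed layout)."""
    rows = csv.reader(csv_lines)
    if has_header:
        next(rows, None)
    return {row[id_column] for row in rows if len(row) > id_column}


def compare_ids(gtf_lines, csv_lines):
    """Return (IDs only in the GTF, IDs only in the strandedness table), both sorted."""
    gtf_ids = gtf_gene_ids(gtf_lines)
    table_ids = csv_gene_ids(csv_lines)
    return sorted(gtf_ids - table_ids), sorted(table_ids - gtf_ids)


# Toy in-memory data; real use would pass open file handles instead.
gtf_demo = [
    "##description: toy example\n",
    'chr1\tHAVANA\tgene\t1\t100\t.\t+\t.\tgene_id "GENE_A"; gene_name "A";\n',
    'chr1\tHAVANA\tgene\t200\t300\t.\t-\t.\tgene_id "GENE_B"; gene_name "B";\n',
]
csv_demo = ["gene,strand\n", "GENE_A,+\n", "GENE_C,-\n"]

missing_from_table, missing_from_gtf = compare_ids(gtf_demo, csv_demo)
print(missing_from_table)  # ['GENE_B']
print(missing_from_gtf)    # ['GENE_C']
```

If the first list is long, most genes in the GTF have no strandedness entry, which would be consistent with the mismatch described above.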
I see. I saw that there is a CreateStrandinfo.py in the scripts folder, but it isn't working for our gtf file. |
Yes, you should make a new strandedness.csv file with the same format. I will update this in the next few days to be done automatically from the provided gtf file. Thanks for the feedback! |
Hi, I created a new strandedness.csv file for my gtf file, based on your example strandedness.csv file. I attach my strandedness file. Thanks. |
Exactly what parts of the analysis did you rerun (i.e. which flags)? Could you again send me the top ~100 lines of this new bam file that still fails ConvperPos? |
I reran the entire process from the start. Here is my annotated_sorted_bam file: |
Did you try to run ConvperPos? Sorry for the confusion, but the ST tag is actually added in the first step by ConvperPos. It then proceeds to annotate the conversions. The problem you had before was the non-matching strandedness IDs and gene IDs, not necessarily the lack of the ST tag directly. After this step, you should see an ST tag, as well as the conversion tags in the header. |
Okay, I will run ConvperPos.py directly on my bam file, then I will report the result. Thanks. |
How did it go? Did you manage to add the conversion tags to the bam file in the end? |
Yes. It was my fault: I had made strandedness.csv tab-separated instead of comma-separated.
During the tagging process, it printed this message:
/ssd-data/workspace/support/tool/anaconda3/envs/python2.7/lib/python2.7/site-packages/pandas/core/frame.py:6692: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version of pandas will change to not sort by default. To accept the future behavior, pass 'sort=False'. To retain the current behavior and silence the warning, pass 'sort=True'. sort=sort)
I think it is just a warning that can be ignored, but I will let you know.
After tagging, I ran vcfFilter, but it failed with this error:
Error in seq.default(min(x, na.rm = na.rm), max(x, na.rm = na.rm), length = breaks) :
and no files were created. Is there anything that I should check?
Thanks for getting in touch. SRR8724279_Aligned.sortedByCoord.out.bam_removeDupl.bam.featureCounts.bam_PosTag.csv.txt |
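For anyone hitting the same tab-versus-comma problem, a quick check of the header line shows which separator a file actually uses, and the standard `csv` module can rewrite it. This is only a sketch under assumptions: the helper names `guess_delimiter` and `tabs_to_commas` are hypothetical, the `gene`/`strand` header is made up, and the real column layout of strandedness.csv is not shown in this thread.

```python
import csv
import io


def guess_delimiter(header_line, candidates=(",", "\t")):
    """Return the candidate that occurs most often in the header line, or None."""
    counts = {d: header_line.count(d) for d in candidates}
    best = max(candidates, key=lambda d: counts[d])
    return best if counts[best] > 0 else None


def tabs_to_commas(lines):
    """Rewrite tab-separated lines as comma-separated text.

    csv.writer quotes any field that itself contains a comma, so the
    output stays parseable as CSV.
    """
    reader = csv.reader(lines, delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Toy data standing in for a tab-separated strandedness file.
tab_lines = ["gene\tstrand\n", "GENE_A\t+\n"]

print(repr(guess_delimiter(tab_lines[0])))  # '\t'
print(tabs_to_commas(tab_lines), end="")
```

Checking only the header keeps the test cheap; a file with mixed separators would need a closer look than this.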
Do I understand correctly that you are running through this with only a single file? The VCF filter step checks how often a certain conversion occurs over different cells and reads. It is important to note that this requires multiple cells to be compared. If you want to run this on a single cell, you can use './data/posfile.csv' to run it through using the SNPs that we detected in Jurkat cells. Alternatively, I would suggest running this with a few (or all) cells. You can then remove some of the cells from the next steps that take more computing time... Also good to note, this step outputs a single pdf file, which shows the top converted positions and the top detected positions (sorted in that order). The next step will then actually filter the conversion tags in the headers for these positions. |
I am running with 2 files for testing (one is stimulated, the other is not). Do you mean that the error may occur when I try to run with just 1 file? |
Did you solve this issue in the end or are you still stuck at this step? |
Hi, I set up the NASC-seq analysis pipeline on our lab's Ubuntu system.
I set up config.py with gencode.v31.primary_assembly.annotation.gtf,
strandedness.csv in the NASC-seq/data folder, and
NASCseqModel.stan in the NASC-seq/data folder.
I used your data from GSE128273, the dataset from your paper.
It worked well up to the annotate step (annotated_sorted_bam is created); however, the conversion tag step yields an empty bam file and a Postag.csv that is empty except for the header, so I can't go on to the next step.
I think the addTags function in ConvperPos.py caused this problem,
specifically this line:
read.set_tag('ST',strandedness.loc[read.get_tag('XT')][1])
How can I solve this problem?
Thanks.
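The quoted line looks up each read's `XT` value in the strandedness table. With pandas, `.loc` raises a `KeyError` when the label is not in the index, so one way to diagnose this is to count how many gene IDs fail the lookup. The sketch below is not ConvperPos code: a plain dict stands in for the strandedness table, a list of strings stands in for the `XT` values of the reads, and `assign_strand` is a hypothetical helper.

```python
from collections import Counter


def assign_strand(gene_ids, strand_by_gene):
    """Return (tagged, misses).

    tagged: (gene_id, strand) pairs for gene IDs found in the table.
    misses: Counter of gene IDs that have no entry in the table.
    """
    tagged = []
    misses = Counter()
    for gene_id in gene_ids:
        strand = strand_by_gene.get(gene_id)
        if strand is None:
            misses[gene_id] += 1  # record the miss instead of raising
            continue
        tagged.append((gene_id, strand))
    return tagged, misses


# Toy data: one known gene, one gene missing from the table.
strand_by_gene = {"GENE_A": "+"}
xt_values = ["GENE_A", "GENE_B", "GENE_B"]

tagged, misses = assign_strand(xt_values, strand_by_gene)
print(tagged)                 # [('GENE_A', '+')]
print(misses.most_common(5))  # [('GENE_B', 2)]
```

If nearly every ID ends up in `misses`, the table and the annotation probably do not share the same gene IDs, or the table was not parsed into the expected columns.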