Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporating SNP data #40

Merged
merged 32 commits into from
Jan 27, 2021
Merged

Incorporating SNP data #40

merged 32 commits into from
Jan 27, 2021

Conversation

rnmitchell
Copy link
Contributor

This PR will be for including SNP data into lusSTR for ForenSeq. There are several steps involved with this:

  • Creating new commands specific to SNPs (something like format_snps and annotate_snps)
  • Formatting SNP data from either UAS Sample Details Report and Phenotype report or STRait Razor data
  • Identifying SNPs reported on the reverse strand in UAS
  • Identifying coordinate of SNP in STRait Razor sequence string (individual SNPs are not reported, only sequence string)
  • Including identity SNPs, ancestry SNPs and phenotype SNPS (have argument to specify which SNPs to include)

@rnmitchell
Copy link
Contributor Author

I changed the original format code to do a quick check that that only STRait Razor .txt files are included in the compiled output file. I thought it might be advantageous to be able to have both the original STRait Razor output files and lusSTR output files all in the same directory (and can run lusSTR multiple times). It will exclude .txt files which throw a ValueError (caused by having final lusSTR output files within the provided directory).

@rnmitchell
Copy link
Contributor Author

I noticed the UAS Sample Details Reports used in the UAS_bulk_upload tests contained sample/analysis/project IDs. I changed the files to positive control (2800M) UAS Sample Detail Reports (and gave generic IDs). Also fixed the format.py script to not duplicate the provided directory name in the code (I have no idea how this worked before??).

@standage
Copy link
Member

I have no idea how this worked before??

Ahh, I see you've reached the final stage of programmer's grief.

  1. That can't happen.
  2. That doesn't happen on my machine.
  3. Wait, that shouldn't happen!
  4. Why does that happen?
  5. Oh, I see...
  6. How did that ever work?!?!

@rnmitchell rnmitchell marked this pull request as ready for review January 26, 2021 13:33
@rnmitchell
Copy link
Contributor Author

@standage this PR is ready for review. Let me know if you think there are additional tests I should include.

I had mentioned in our meeting yesterday that we were getting additional data to test the code with, but I have no idea when that would be, so I think we can move ahead with this and make any changes based on new data later.

Comment on lines +211 to +212
snp_loc = metadata['Coord']
snp_call = seq[snp_loc]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these two lines are the primary logic for determining the SNP allele, right? Everything else seems to be handling special cases or putative and unexpected indels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup!

Copy link
Member

@standage standage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I can't claim to have given every block of code a thorough examination. But the central logic seems pretty straightforward, the CLI is clear and well documented, and the tests pass for me, so thumbs up from me!

@rnmitchell
Copy link
Contributor Author

Can you approve the merge? It' still saying Review required for some reason.

@rnmitchell
Copy link
Contributor Author

Oh... nevermind! I guess my page just had to refresh :)

@rnmitchell rnmitchell merged commit 6c3a74f into master Jan 27, 2021
@standage standage deleted the snps branch January 6, 2023 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants