Separate output #43

rnmitchell · 2021-07-13T21:15:00Z

This PR adds an option to split the final output by sample, writing one file per sample. This is useful when feeding the output directly into LLAMAS.

rnmitchell · 2021-08-19T16:32:14Z

In addition to separating files, this PR changes how lusSTR handles missing data. Previously, it dropped any allele with 0 reads. However, those alleles are needed for EuroForMix, so lusSTR no longer drops any allele from the output.

rnmitchell · 2021-08-20T13:47:37Z

This is ready for review @standage. I also plan to release the next version of lusSTR... it's been a minute since I last did that. :)

standage · 2021-08-20T14:14:17Z

lusSTR/snps.py

-    data = uas_load(infile, snp_type_arg)
-    data_filt = data.loc[data['Reads'] != 0].reset_index(drop=True)
+    data_filt = uas_load(infile, snp_type_arg).reset_index(drop=True)


Ok, this is where you're retaining alleles with 0 reads.

standage · 2021-08-20T14:17:20Z

lusSTR/snps.py

+        if data_filt.loc[j, 'Typed Allele?'] == 'No':
+            flag = 'Contains untyped allele'


Does the Typed Allele? column refer to whether there were any reads for that allele?

In any case, spaces and punctuation in column names can be problematic. If you have just added the column in this PR, I'd recommend naming it IsTyped instead, and using boolean values (True/False) rather than "Yes"/"No" strings.

Or AlleleIsTyped or something.

The Typed Allele? column comes from the Sample Details Report. It doesn't necessarily indicate an allele with 0 reads: it marks an allele whose read count falls below the various thresholds, so an untyped allele can have a low number of reads as well as 0 (i.e. the column says whether the allele is considered to be a real allele). The Yes/No values are read directly from the Sample Details Report, so I'd prefer to leave them as is, but I can change the column name.

The Typed Allele? column is from the Sample Details Report

I see. Maybe worth just leaving it in then...
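For reference, should the boolean representation suggested above ever be wanted, the conversion could be sketched roughly as follows. This is only an illustration: the helper names is_typed and allele_flag and the IsTyped key are hypothetical, a plain dict stands in for a DataFrame row, and the empty-string flag for typed alleles is an assumption rather than lusSTR's behavior.

```python
def is_typed(value):
    """Map the report's 'Yes'/'No' strings to a boolean (hypothetical helper)."""
    mapping = {"Yes": True, "No": False}
    try:
        return mapping[value.strip()]
    except KeyError:
        # Fail loudly on anything other than the two expected strings.
        raise ValueError(f"unexpected 'Typed Allele?' value: {value!r}") from None


def allele_flag(record):
    """Return the flag text for a record with a boolean 'IsTyped' key (assumed name)."""
    # Assumption: typed alleles get an empty flag in this sketch.
    return "" if record["IsTyped"] else "Contains untyped allele"


# A plain dict stands in for one row of the Sample Details Report.
record = {"Typed Allele?": "No", "Reads": 3}
record["IsTyped"] = is_typed(record.pop("Typed Allele?"))
flag = allele_flag(record)
```

Keeping the Yes/No strings exactly as the report provides them, as settled on above, avoids this extra conversion step entirely.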

standage · 2021-08-20T14:18:26Z

lusSTR/snps.py

+    try:
+        os.mkdir(output_dir)
+    except FileExistsError:
+        pass


I'd suggest:

os.makedirs(output_dir, exist_ok=True)
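To show how that call fits with the per-sample output, here is a minimal sketch that creates the output directory and writes one CSV per sample. The function name write_per_sample, the SampleID column, and the file naming are assumptions made for this example, not lusSTR's actual code.

```python
import csv
import os
from collections import defaultdict


def write_per_sample(rows, output_dir, sample_col="SampleID"):
    """Write one CSV per sample into output_dir (hypothetical helper)."""
    # exist_ok=True makes this a no-op when the directory already exists;
    # makedirs also creates any missing parent directories.
    os.makedirs(output_dir, exist_ok=True)

    # Group the rows by their sample identifier.
    by_sample = defaultdict(list)
    for row in rows:
        by_sample[row[sample_col]].append(row)

    paths = []
    for sample, sample_rows in by_sample.items():
        path = os.path.join(output_dir, f"{sample}.csv")
        with open(path, "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(sample_rows[0].keys()))
            writer.writeheader()
            writer.writerows(sample_rows)
        paths.append(path)
    return sorted(paths)
```

One difference from the try/except version: os.makedirs also creates missing parent directories, whereas os.mkdir raises FileNotFoundError if the parent does not exist. Both still raise FileExistsError when the path exists but is a regular file.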

Rebecca Mitchell added 15 commits March 4, 2021 14:21

Updated README and test files with new col names

4466704

Fixed bug with combining reads for sex chr STRs

aff510c

Updated test files and test for combining reads

8c37af5

Updated README

4982a0f

Added code to remove amelogenin sequences in annotate

ab27212

Merge branch 'readme_update' of https://www.github.com/bioforensics/l…

aa14c2f

…usSTR into readme_update

format command can take in single STRaitRazor file

5d54675

Updated cli descriptions

0ab999c

Updated README

a29bad5

Initial commit

e6e03f8

No longer remove SNPs with missing data

48b30ae

resolve conflicts

39a1567

updated annot script

f9c71ec

updated tests to accommodate not removing missing data

907f24e

Updated tests and added test for separating output files

04c9758

rnmitchell requested a review from standage August 20, 2021 13:40

standage reviewed Aug 20, 2021

mkdir change

cb7e138

standage approved these changes Aug 20, 2021

rnmitchell merged commit 991761c into master Aug 20, 2021
