feat: add --only-passing option to statSTR (#229)
Co-authored-by: Arya Massarat <[email protected]>
Buuxx and aryarm authored Nov 8, 2024
1 parent ce22d84 commit 51c0481
Showing 4 changed files with 22 additions and 3 deletions.
6 changes: 3 additions & 3 deletions trtools/prancSTR/README.rst
@@ -45,12 +45,12 @@ Other general parameters:
* :code:`--region <string>`: Restrict to the region chr:start-end. VCF file must be bgzipped and indexed to use this option.
* :code:`--samples <string>`: Restrict to the given list of samples. Samples are comma separated.
* :code:`--vcftype <string>`: Specify the tool which generated the vcf call file for STRs. Currently this will fail if using anything other than :code:`hipstr` VCFs.
-* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column
-* :code:`--output-all`: Force tool to output results for all loci. Overrides :code:``--only-passing``.
+* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column.
+* :code:`--output-all`: Force tool to output results for all loci. Overrides :code:`--only-passing`.
* :code:`--readfield <string>`: Specify which VCF format field output by HipSTR to utilize for extracting read information. We recommend setting this to "MALLREADS". "ALLREADS" is also accepted but we have found that it produces unreliable results.
* :code:`--debug`: Print helpful debug messages.
* :code:`--quiet`: Restrict printing of any messages.
-* :code:`--version`: Print the version of the tool
+* :code:`--version`: Print the version of the tool.

Notes:

1 change: 1 addition & 0 deletions trtools/statSTR/README.rst
@@ -30,6 +30,7 @@ Optional general parameters:
* :code:`--sample-prefixes <string>`: The prefixes used to name the output for each sample group. By default uses 1, 2, 3, etc. Must be the same length as :code:`--samples`.
* :code:`--region <string>`: Restrict to specific regions (chrom:start-end). Requires the input VCF to be bgzipped and tabix indexed.
* :code:`--precision <int>`: How much precision to use when writing stats (default = 3)
+* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column.

For specific statistics available, see below.
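
To make the effect of :code:`--only-passing` concrete, the sketch below filters raw VCF text lines on the FILTER column (the seventh tab-separated field). It is only an illustration and is not how statSTR reads VCFs. The helper names (``is_passing``, ``passing_records``) and the example records are hypothetical, and treating a missing FILTER value (``.``) as passing is an assumption made here to mirror the ``record.FILTER is not None`` check in this commit.

```python
def is_passing(filter_value):
    """Return True when a raw FILTER column value counts as passing.

    Assumption: both "PASS" and the missing value "." are kept.
    """
    return filter_value in ("PASS", ".")


def passing_records(vcf_lines):
    """Keep only the data lines whose FILTER column is passing."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            # Header and meta lines carry no FILTER value of their own.
            continue
        fields = line.rstrip("\n").split("\t")
        if is_passing(fields[6]):  # FILTER is the 7th VCF column
            kept.append(line)
    return kept


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\tSTR_1\tACAC\tACACAC\t.\tPASS\t.",
    "chr1\t200\tSTR_2\tTTT\tTTTT\t.\tLowQual\t.",
    "chr1\t300\tSTR_3\tGAGA\tGA\t.\t.\t.",
]

kept_ids = [line.split("\t")[2] for line in passing_records(example)]
print(kept_ids)
```

Here ``STR_2`` is dropped because its FILTER column carries a failure label, while the other two records are kept.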

6 changes: 6 additions & 0 deletions trtools/statSTR/statSTR.py
@@ -453,6 +453,8 @@ def getargs(): # pragma: no cover
     filter_group.add_argument("--region", help="Restrict to the region "
                               "chrom:start-end. Requires file to be bgzipped and"
                               " tabix indexed.", type=str)
+    filter_group.add_argument("--only-passing", help="Only process records "
+                              "where FILTER==PASS", action="store_true")
     stat_group_name = "Stats group"
     stat_group = parser.add_argument_group(stat_group_name)
     stat_group.add_argument("--thresh", help="Output threshold field (max allele size, used for GangSTR strinfo).", action="store_true")
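
As a minimal sketch of how a ``store_true`` flag such as ``--only-passing`` behaves, the standalone parser below registers just that one option. The parser description and group title are placeholders chosen for this example and are not taken from statSTR. The point is that argparse converts the dashes to underscores, so the value is read as ``args.only_passing``, and that it defaults to ``False`` when the flag is absent.

```python
import argparse

# Standalone parser for illustration; description and group title are placeholders.
parser = argparse.ArgumentParser(description="Sketch of a flag-only option")
filter_group = parser.add_argument_group("Filtering group")
filter_group.add_argument(
    "--only-passing",
    help="Only process records where FILTER==PASS",
    action="store_true",
)

# Without the flag the attribute defaults to False.
default_args = parser.parse_args([])
# With the flag it becomes True; "--only-passing" maps to "only_passing".
flag_args = parser.parse_args(["--only-passing"])

print(default_args.only_passing, flag_args.only_passing)
```

Because the default is ``False``, existing invocations that never pass the flag keep their previous behaviour.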
@@ -574,6 +576,10 @@ def main(args):
         nrecords += 1

         trrecord = trh.HarmonizeRecord(vcftype, record)
+
+        if args.only_passing and record.FILTER is not None:
+            continue
+
         if args.plot_afreq and num_plotted <= MAXPLOTS:
             PlotAlleleFreqs(trrecord, args.out, sample_indexes=sample_indexes, sampleprefixes=sample_prefixes)
             num_plotted += 1
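
The control flow of this hunk can be sketched with stub records, assuming (as the check in the diff implies) that the VCF reader reports ``FILTER`` as ``None`` for a passing record and as a string otherwise. The ``process`` function and the ``SimpleNamespace`` records are hypothetical stand-ins, not statSTR code. Note that in this sketch, as in the hunk, the record counter is incremented before the filter is applied, so skipped records still count toward ``nrecords``.

```python
from types import SimpleNamespace


def process(records, only_passing):
    """Mimic the loop shape: count every record, skip non-passing ones on request."""
    nrecords = 0
    processed = []
    for record in records:
        nrecords += 1
        # Assumption: FILTER is None for passing records, a string otherwise.
        if only_passing and record.FILTER is not None:
            continue
        processed.append(record.ID)  # stands in for the stats and plotting work
    return nrecords, processed


# Hypothetical stub records, for illustration only.
records = [
    SimpleNamespace(ID="STR_1", FILTER=None),
    SimpleNamespace(ID="STR_2", FILTER="LowQual"),
    SimpleNamespace(ID="STR_3", FILTER=None),
]

print(process(records, only_passing=True))
print(process(records, only_passing=False))
```

A check of this shape, comparing what is processed with and without the flag, would assert more than the return code that the test added in this commit verifies.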
12 changes: 12 additions & 0 deletions trtools/statSTR/tests/test_statSTR.py
@@ -17,6 +17,7 @@ def args(tmpdir):
     args.sample_prefixes = None
     args.plot_afreq = False
     args.region = None
+    args.only_passing = False
     args.thresh = False
     args.afreq = False
     args.acount = False
@@ -49,6 +50,17 @@ def test_RightFile(args, vcfdir):
     retcode = main(args)
     assert retcode==0

+# Test the --only-passing option
+def test_OnlyPassing(args, vcfdir):
+    fname = os.path.join(vcfdir, "CEU_test.vcf.gz")
+    args.vcf = fname
+
+    # Run with --only-passing enabled
+    args.only_passing = True
+    args.region = None
+    retcode = main(args)
+    assert retcode==0
+
 # Test all the statSTR options
 def test_Stats(args, vcfdir, capsys):
     fname = os.path.join(vcfdir, "few_samples_few_loci.vcf.gz")