feat: add --only-passing option to statSTR #229

Merged on Nov 8, 2024 (4 commits)
6 changes: 3 additions & 3 deletions trtools/prancSTR/README.rst
@@ -45,12 +45,12 @@ Other general parameters:
* :code:`--region <string>`: Restrict to the region chr:start-end. VCF file must be bgzipped and indexed to use this option.
* :code:`--samples <string>`: Restrict to the given list of samples. Samples are comma separated.
* :code:`--vcftype <string>`: Specify the tool which generated the vcf call file for STRs. Currently this will fail if using anything other than :code:`hipstr` VCFs.
-* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column
-* :code:`--output-all`: Force tool to output results for all loci. Overrides :code:``--only-passing``.
+* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column.
+* :code:`--output-all`: Force tool to output results for all loci. Overrides :code:`--only-passing`.
* :code:`--readfield <string>`: Specify which VCF format field output by HipSTR to utilize for extracting read information. We recommend setting this to "MALLREADS". "ALLREADS" is also accepted but we have found that it produces unreliable results.
* :code:`--debug`: Print helpful debug messages.
* :code:`--quiet`: Restrict printing of any messages.
-* :code:`--version`: Print the version of the tool
+* :code:`--version`: Print the version of the tool.

Notes:

1 change: 1 addition & 0 deletions trtools/statSTR/README.rst
@@ -30,6 +30,7 @@ Optional general parameters:
* :code:`--sample-prefixes <string>`: The prefixes to name output for each sample group. By default uses 1, 2, 3 etc. Must be the same length as :code:`--samples`.
* :code:`--region <string>`: Restrict to specific regions (chrom:start-end). Requires the input VCF to be bgzipped and tabix indexed.
* :code:`--precision <int>`: How much precision to use when writing stats (default = 3)
+* :code:`--only-passing`: Filters out the VCF records with non-passing FILTER column.

For specific statistics available, see below.
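
As a rough illustration of what the new README entry means at the VCF level, the sketch below keeps only data lines whose FILTER column (the seventh tab-separated field) is PASS. The helper name `passing_lines` and its `keep_unfiltered` switch are hypothetical and not part of statSTR. Treating `.` (no filters applied) as passing is an assumption, chosen to mirror parsers that report both cases the same way.

```python
def passing_lines(vcf_lines, keep_unfiltered=True):
    """Yield VCF data lines whose FILTER column is PASS.

    Hypothetical helper for illustration only; statSTR itself works on
    parsed records, not on raw text lines.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and column-name lines
        fields = line.rstrip("\n").split("\t")
        filt = fields[6]  # FILTER is the 7th fixed VCF column
        if filt == "PASS" or (keep_unfiltered and filt == "."):
            yield line


# Made-up records: one passing, one flagged, one with no filter applied.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\tSTR_1\tACAC\tACACAC\t.\tPASS\t.",
    "chr1\t200\tSTR_2\tTGTG\tTG\t.\tLowQual\t.",
    "chr1\t300\tSTR_3\tCAGCAG\tCAG\t.\t.\t.",
]

kept_ids = [line.split("\t")[2] for line in passing_lines(example)]
strict_ids = [
    line.split("\t")[2]
    for line in passing_lines(example, keep_unfiltered=False)
]
print(kept_ids)
print(strict_ids)
```

With the default setting the flagged record `STR_2` is dropped; with `keep_unfiltered=False` the unfiltered record `STR_3` is dropped as well.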

6 changes: 6 additions & 0 deletions trtools/statSTR/statSTR.py
@@ -453,6 +453,8 @@ def getargs(): # pragma: no cover
filter_group.add_argument("--region", help="Restrict to the region "
                          "chrom:start-end. Requires file to be bgzipped and"
                          " tabix indexed.", type=str)
+filter_group.add_argument("--only-passing", help="Only process records "
+                          "where FILTER==PASS", action="store_true")
stat_group_name = "Stats group"
stat_group = parser.add_argument_group(stat_group_name)
stat_group.add_argument("--thresh", help="Output threshold field (max allele size, used for GangSTR strinfo).", action="store_true")
@@ -574,6 +576,10 @@ def main(args):
nrecords += 1

trrecord = trh.HarmonizeRecord(vcftype, record)
+
+if args.only_passing and record.FILTER is not None:
+    continue
+
if args.plot_afreq and num_plotted <= MAXPLOTS:
    PlotAlleleFreqs(trrecord, args.out, sample_indexes=sample_indexes, sampleprefixes=sample_prefixes)
    num_plotted += 1
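
The two hunks above fit together as follows: `action="store_true"` makes `args.only_passing` default to `False`, and the loop skips a record when the flag is set and the record's `FILTER` is not `None`. The sketch below reproduces that pattern in isolation. It assumes the VCF parser reports a passing (or unset) FILTER as `None`, which is what the `is not None` check implies. The `SimpleNamespace` records, the group title, and `count_processed` are stand-ins invented for this example, not statSTR code.

```python
import argparse
from types import SimpleNamespace


def build_parser():
    """Minimal parser carrying only the new flag (group title is made up)."""
    parser = argparse.ArgumentParser(prog="statSTR-sketch")
    filter_group = parser.add_argument_group("Filtering group")
    filter_group.add_argument(
        "--only-passing",
        help="Only process records where FILTER==PASS",
        action="store_true",  # flag absent -> False, flag present -> True
    )
    return parser


def count_processed(records, only_passing):
    """Count records that survive the same check used in the diff."""
    nprocessed = 0
    for record in records:
        # Assumption: FILTER is None for passing records.
        if only_passing and record.FILTER is not None:
            continue
        nprocessed += 1
    return nprocessed


# Stand-in records: two passing, one flagged.
records = [
    SimpleNamespace(FILTER=None),
    SimpleNamespace(FILTER="LowQual"),
    SimpleNamespace(FILTER=None),
]

default_args = build_parser().parse_args([])
strict_args = build_parser().parse_args(["--only-passing"])

n_default = count_processed(records, default_args.only_passing)
n_strict = count_processed(records, strict_args.only_passing)
print(n_default, n_strict)
```

Note that argparse converts the dash in `--only-passing` to an underscore, which is why the loop reads `args.only_passing`.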
12 changes: 12 additions & 0 deletions trtools/statSTR/tests/test_statSTR.py
@@ -17,6 +17,7 @@ def args(tmpdir):
args.sample_prefixes = None
args.plot_afreq = False
args.region = None
+args.only_passing = False
args.thresh = False
args.afreq = False
args.acount = False
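
The fixture needs the new attribute because the tests call `main(args)` with a hand-built object instead of going through `getargs()`, so the `store_true` default is never applied. A minimal sketch of the failure this line prevents, assuming the fixture is an `argparse.Namespace` (the diff does not show its type) and using a hypothetical `uses_flag` stand-in for the part of `main` that reads the flag:

```python
import argparse


def uses_flag(args):
    """Stand-in for the code path in main() that reads the new flag."""
    return bool(args.only_passing)


# Abridged fixtures: before and after the attribute was added.
old_fixture = argparse.Namespace(region=None)
new_fixture = argparse.Namespace(region=None, only_passing=False)

try:
    uses_flag(old_fixture)
    missing_attribute = False
except AttributeError:
    # A Namespace has no implicit defaults for attributes never set.
    missing_attribute = True

print(missing_attribute, uses_flag(new_fixture))
```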
@@ -49,6 +50,17 @@ def test_RightFile(args, vcfdir):
    retcode = main(args)
    assert retcode==0

+# Test the only passing option
+def test_OnlyPassing(args, vcfdir):
+    fname = os.path.join(vcfdir, "CEU_test.vcf.gz")
+    args.vcf = fname
+
+    # With only passing
+    args.only_passing = True
+    args.region = None
+    retcode = main(args)
+    assert retcode==0
+
# Test all the statSTR options
def test_Stats(args, vcfdir, capsys):
    fname = os.path.join(vcfdir, "few_samples_few_loci.vcf.gz")