SoftClippedReadFilter Shows Filtering Result Opposite to Description. #8887

Closed
yixuanw99 opened this issue Jun 22, 2024 · 1 comment · Fixed by #8888

Comments

yixuanw99 commented Jun 22, 2024

Bug Report

Affected tool(s) or class(es)

SoftClippedReadFilter

Affected version(s)

4.5.0.0

Description

According to the filter's description, setting --soft-clipped-leading-trailing-ratio to 0.9 should mean that reads are filtered out if more than 90% of their bases are soft-clipped at either the beginning or the end. A higher value therefore indicates a more lenient filter, resulting in fewer reads being excluded.

However, it seems that the current implementation retains reads with a ratio of 0.9 to 1.0 instead of excluding them, which is the opposite of what the description suggests. In practice, increasing the threshold from 0.3 to 0.6 and then to 0.9 results in more reads being filtered out, which is contrary to the expected behavior.

Steps to reproduce

  1. Increase the threshold of --soft-clipped-leading-trailing-ratio from 0.3 to 0.6, and then to 0.9.
  2. Observe that more reads are being filtered out with higher thresholds.

Refer to the attached log for detailed observations: SoftClippedReadFilter_test.log

Expected behavior

Filter out reads where the ratio of soft-clipped bases to total bases exceeds the given threshold. For example, set the threshold to 0.9 and filter out reads with a ratio > 0.9.

Actual behavior

Filter out reads where the ratio of soft-clipped bases to total bases is less than the given threshold. For example, set the threshold to 0.9 and filter out reads with a ratio < 0.9.
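
To make the difference between the two behaviors concrete, here is a minimal sketch in Java. It is not the GATK implementation: each read is reduced to a single soft-clip ratio, the class and method names are hypothetical, and the sample ratios are made up for illustration. The handling of a ratio exactly equal to the threshold simply follows the wording above.

```java
import java.util.function.DoublePredicate;

// Hypothetical sketch: a read is represented only by its soft-clip ratio
// (soft-clipped bases / total bases). Not the GATK implementation.
final class SoftClipBehaviorSketch {

    // Expected behavior: a read is filtered out when ratio > threshold,
    // so it is kept in every other case.
    static DoublePredicate expectedKeep(double threshold) {
        return ratio -> !(ratio > threshold);
    }

    // Reported behavior: a read is filtered out when ratio < threshold,
    // so it is kept in every other case.
    static DoublePredicate reportedKeep(double threshold) {
        return ratio -> !(ratio < threshold);
    }

    // Counts how many of the given ratios the predicate rejects.
    static long countFiltered(double[] ratios, DoublePredicate keep) {
        long filtered = 0;
        for (double ratio : ratios) {
            if (!keep.test(ratio)) {
                filtered++;
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Made-up ratios, chosen only to illustrate the trend.
        double[] ratios = {0.0, 0.2, 0.5, 0.8, 0.95};
        for (double threshold : new double[] {0.3, 0.6, 0.9}) {
            System.out.println("threshold " + threshold
                    + ": expected filters " + countFiltered(ratios, expectedKeep(threshold))
                    + ", reported filters " + countFiltered(ratios, reportedKeep(threshold)));
        }
    }
}
```

On these sample ratios the expected behavior filters 3, 2, and 1 reads as the threshold rises from 0.3 to 0.6 to 0.9, while the reported behavior filters 2, 3, and 4, which matches the trend described in the steps to reproduce.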

Simple Solution Proposal

I believe the issue might be resolved by inverting the comparison operators in the relevant sections of the code. Specifically:

  • Change the > to < in line 66 and line 95 of
    src/main/java/org/broadinstitute/hellbender/engine/filters/SoftClippedReadFilter.java

This change should make the test() function of the ReadFilter class return false when the ratio exceeds the threshold, aligning with the intended functionality where true means retaining the GATKRead in the ReadFilter.
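
A rough sketch of what the corrected check could look like, assuming the leading and trailing ratios are each compared against the threshold separately (my reading of the filter description). The class, constructor, and parameter names are hypothetical and do not mirror the code in SoftClippedReadFilter.java; rejecting a non-positive read length is also an assumption made only to keep the sketch safe.

```java
// Hypothetical sketch of the corrected check. Names are illustrative and
// do not mirror SoftClippedReadFilter.java.
final class LeadingTrailingClipSketch {
    private final double threshold;

    LeadingTrailingClipSketch(double threshold) {
        this.threshold = threshold;
    }

    // Follows the read-filter convention described above:
    // true retains the read, false filters it out.
    boolean test(int leadingSoftClips, int trailingSoftClips, int totalBases) {
        if (totalBases <= 0) {
            // Assumption for this sketch only: refuse reads with no bases.
            throw new IllegalArgumentException("totalBases must be positive");
        }
        double leadingRatio = (double) leadingSoftClips / totalBases;
        double trailingRatio = (double) trailingSoftClips / totalBases;
        // Retain the read only when neither end exceeds the threshold.
        return leadingRatio <= threshold && trailingRatio <= threshold;
    }

    public static void main(String[] args) {
        LeadingTrailingClipSketch filter = new LeadingTrailingClipSketch(0.9);
        System.out.println(filter.test(95, 0, 100));  // heavily clipped at the start
        System.out.println(filter.test(10, 20, 100)); // lightly clipped at both ends
    }
}
```

With a threshold of 0.9, the first read (95 of 100 bases soft-clipped at the start) is filtered out and the second is retained.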

Collaborator

jamesemery commented Jun 24, 2024

Hello @yixuanw99. This sounds like a problem that we should look into. However, there is a workaround in master that should be in the next release: we have introduced an --inverted-read-filter argument (#8724) that lets you flip the logic of any read filter in the GATK, which should serve as an easy workaround in this case. Furthermore, the tool itself has a built-in argument, --invert-soft-clip-ratio-filter, which should allow you to tailor the behavior to your liking.
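
As an illustration of why inverting the filter works as a workaround, the sketch below negates a predicate that models the reported behavior. It only shows the logical effect; it does not use the GATK API or command line, and the class and method names are hypothetical.

```java
import java.util.function.DoublePredicate;

// Hypothetical sketch: shows the logical effect of inverting a filter.
// It does not call any GATK code.
final class InvertedFilterSketch {

    // Models the reported behavior: reads with ratio < threshold are
    // filtered out, everything else is kept.
    static DoublePredicate reportedKeep(double threshold) {
        return ratio -> !(ratio < threshold);
    }

    // Inverting a filter flips every keep/discard decision.
    static DoublePredicate inverted(DoublePredicate keep) {
        return keep.negate();
    }

    public static void main(String[] args) {
        DoublePredicate workaround = inverted(reportedKeep(0.9));
        System.out.println(workaround.test(0.95)); // heavily soft-clipped read
        System.out.println(workaround.test(0.2));  // lightly soft-clipped read
    }
}
```

After inversion the heavily soft-clipped read (ratio 0.95) is discarded and the lightly clipped one (ratio 0.2) is kept, which is the behavior the filter description promises. A read whose ratio equals the threshold exactly is the one case where plain negation and the expected behavior can differ.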
