fix: skip REF-only records #576

Merged
merged 4 commits into from
Oct 11, 2024
9 changes: 8 additions & 1 deletion src/annotate/strucvars/mod.rs
@@ -1889,7 +1889,7 @@ pub trait VcfRecordConverter {
let mut end: Option<i32> = None;
let alleles = vcf_record.alternate_bases().as_ref();
if alleles.len() != 1 {
- panic!("Only one alternative allele is supported for SVs");
+ panic!("Only one alternative allele is supported for SVs, got {} alternative alleles ({:?})", alleles.len(), alleles);
tedil marked this conversation as resolved.
}
let allele = &alleles[0];
// TODO find out how to handle this properly (via noodles?)
@@ -2959,6 +2959,13 @@ pub async fn run_vcf_to_jsonl(
rng.fill_bytes(&mut uuid_buf);
let uuid = Uuid::from_bytes(uuid_buf);

+ if record.alternate_bases().is_empty()
+ || record.alternate_bases().as_ref() == ["<*>".to_string()]
+ {
+ // REF-only, skip
+ tracing::warn!("skipping REF-only / empty ALT record {:?}", record);
⚠️ Potential issue

Avoid logging entire records to prevent potential sensitive data exposure

Logging the full record may inadvertently expose sensitive information. Consider logging only essential details, such as the record's position and reference sequence name, to avoid potential PII leakage.

Apply this diff to address the issue:

-                tracing::warn!("skipping REF-only / empty ALT record {:?}", record);
+                tracing::warn!(
+                    "skipping REF-only / empty ALT record at {}:{}",
+                    record.reference_sequence_name(),
+                    record.position()
+                );

+ continue;
+ }
let mut record = converter.convert(pedigree, &record, uuid, GenomeRelease::Grch37)?;
annotate_cov_mq(&mut record, cov_readers)?;
if let Some(chromosome_no) = mapping.get(&record.chromosome) {
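
For readers following the hunk above, here is a minimal standalone sketch of the skip condition and of the locus-only log text proposed in the review comment. It is not the code merged in this PR: `is_ref_only` and `skip_message` are hypothetical helper names, plain `String`/`&str`/`usize` values stand in for the record's ALT alleles, reference sequence name, and position, and it assumes `<*>` is the only symbolic allele that marks a REF-only record here.

```rust
/// Hypothetical helper: true when a record carries no real ALT allele.
/// `alts` stands in for the strings behind `record.alternate_bases()`.
fn is_ref_only(alts: &[String]) -> bool {
    // No ALT alleles at all, or exactly one ALT that is the symbolic
    // "<*>" allele (mirrors the slice-vs-array comparison in the diff,
    // which can only match a single-element ALT list).
    alts.is_empty() || (alts.len() == 1 && alts[0] == "<*>")
}

/// Hypothetical helper: builds the warning text from the locus only,
/// so nothing else from the record reaches the logs.
fn skip_message(chrom: &str, pos: usize) -> String {
    format!("skipping REF-only / empty ALT record at {}:{}", chrom, pos)
}
```

Pulling the condition into a small predicate like this would make it unit-testable on its own; whether that is worth it for a two-clause check is a judgment call for the maintainers.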