feat: implementing "seqvars prefilter" (#209)
holtgrewe committed Oct 9, 2023
1 parent a783ef3 commit 937cd12
Showing 4 changed files with 70 additions and 0 deletions.
34 changes: 34 additions & 0 deletions README.md
@@ -18,6 +18,7 @@ At the moment, the following sub commands exist:
    - `db mk-inhouse` -- compile per-case structural variant into an in-house database previously created by `db compile`
- `seqvars` -- subcommands for processing sequence (aka small/SNV/indel) variants
    - `seqvars ingest` -- convert single VCF file into internal format for use with `seqvars query`
    - `seqvars prefilter` -- limit the result of `seqvars ingest` by population frequency and/or distance to exon
    - `seqvars query` -- perform sequence variant filtration and on-the-fly annotation
- `strucvars` -- subcommands for processing structural (aka large variants, CNVs, etc.) variants
    - `strucvars ingest` -- convert one or more structural variant files for use with `strucvars query`
@@ -129,6 +130,39 @@ Overall, the command will emit the following header rows in addition to the `##c
> [!NOTE]
> Future versions of the worker will annotate the worst effect on a MANE select or MANE Clinical transcript.
## The `seqvars prefilter` Command

This command takes as input a file created by `seqvars ingest` and filters the variants by population frequency and/or distance to exon.
You pass the prefilter criteria as JSON on the command line; each JSON object must correspond to the following Rust struct:

```rust
struct PrefilterParams {
    /// Path to output file.
    pub path_out: String,
    /// Maximal allele population frequency.
    pub max_freq: f64,
    /// Maximal distance to exon.
    pub max_dist: i32,
}
```
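
As an illustration of how these two thresholds might be applied, here is a minimal sketch. The `VariantSummary` type and the `passes` function are hypothetical and not part of the worker. The sketch further assumes that both thresholds are inclusive upper bounds and that a variant must satisfy both; the actual semantics of the worker may differ.

```rust
/// Mirrors the parameter struct documented above.
#[derive(Debug, Clone)]
pub struct PrefilterParams {
    /// Path to output file.
    pub path_out: String,
    /// Maximal allele population frequency.
    pub max_freq: f64,
    /// Maximal distance to exon.
    pub max_dist: i32,
}

/// Hypothetical per-variant summary (an assumption for this sketch,
/// not a type from the worker).
#[derive(Debug, Clone, Copy)]
pub struct VariantSummary {
    /// Population allele frequency in the range [0, 1].
    pub pop_freq: f64,
    /// Distance to the nearest exon in base pairs (0 if exonic).
    pub exon_dist: i32,
}

/// Assumed semantics: keep a variant only if it meets both thresholds,
/// with both bounds treated as inclusive.
pub fn passes(params: &PrefilterParams, variant: &VariantSummary) -> bool {
    variant.pop_freq <= params.max_freq && variant.exon_dist <= params.max_dist
}
```

Under these assumptions, the parameters `max_freq: 0.01` and `max_dist: 100` would keep a variant with frequency 0.001 that lies 50 bp from an exon and drop one with frequency 0.05.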

You can either specify the parameters directly on the command line as a JSON string, or pass `@` followed by the path to a JSONL file.
`--params` can be given multiple times, and both forms can be mixed.

```
$ varfish-server-worker seqvars prefilter \
    --path-in INPUT.vcf \
    --params '{"path_out": "out.vcf", "max_freq": 0.01, "max_dist": 100}' \
    [--params ...]
# OR
$ varfish-server-worker seqvars prefilter \
    --path-in INPUT.vcf \
    --params @path/to/params.json \
    [--params ...]
```
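
The `@` convention described above could be resolved roughly as follows. This is a sketch only: `expand_params` is a hypothetical helper, not a function from the worker, and it assumes that a JSONL file holds one JSON object per line and that blank lines are skipped. It returns the raw JSON strings and leaves parsing to the caller.

```rust
use std::fs;
use std::io;

/// Expand `--params` values into raw JSON strings (hypothetical helper).
///
/// A value starting with `@` is treated as the path to a JSONL file and
/// contributes one entry per non-empty line; any other value is passed
/// through unchanged as an inline JSON string.
pub fn expand_params(values: &[String]) -> io::Result<Vec<String>> {
    let mut out = Vec::new();
    for value in values {
        if let Some(path) = value.strip_prefix('@') {
            // Assumption: one JSON object per line, blank lines ignored.
            let text = fs::read_to_string(path)?;
            out.extend(
                text.lines()
                    .map(str::trim)
                    .filter(|line| !line.is_empty())
                    .map(String::from),
            );
        } else {
            out.push(value.clone());
        }
    }
    Ok(out)
}
```

Keeping the expansion separate from JSON parsing means inline and file-based values go through the same deserialization step afterwards, whichever way they were supplied.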

## The `strucvars ingest` Command

This command takes as input one or more VCF files from structural variant callers and converts them into a file for further querying.
4 changes: 4 additions & 0 deletions src/main.rs
@@ -84,6 +84,7 @@ struct Seqvars {
#[derive(Debug, Subcommand)]
enum SeqvarsCommands {
    Ingest(seqvars::ingest::Args),
    Prefilter(seqvars::prefilter::Args),
}

fn main() -> Result<(), anyhow::Error> {
@@ -121,6 +122,9 @@ fn main() -> Result<(), anyhow::Error> {
SeqvarsCommands::Ingest(args) => {
seqvars::ingest::run(&cli.common, args)?;
}
SeqvarsCommands::Prefilter(args) => {
seqvars::prefilter::run(&cli.common, args)?;
}
},
Commands::Strucvars(strucvars) => match &strucvars.command {
StrucvarsCommands::Ingest(args) => {
1 change: 1 addition & 0 deletions src/seqvars/mod.rs
@@ -1 +1,2 @@
pub mod ingest;
pub mod prefilter;
31 changes: 31 additions & 0 deletions src/seqvars/prefilter/mod.rs
@@ -0,0 +1,31 @@
//! Implementation of `seqvars prefilter` subcommand.
use crate::common;

/// Command line arguments for `seqvars prefilter` subcommand.
#[derive(Debug, clap::Parser)]
#[command(author, version, about = "prefilter an ingested variant VCF", long_about = None)]
pub struct Args {
    /// Path to input file.
    #[clap(long)]
    pub path_in: String,
    /// Prefilter parameters or @ with path to JSONL file.
    #[clap(long)]
    pub params: Vec<String>,
}

/// Main entry point for `seqvars prefilter` sub command.
pub fn run(args_common: &crate::common::Args, args: &Args) -> Result<(), anyhow::Error> {
    let before_anything = std::time::Instant::now();
    tracing::info!("args_common = {:#?}", &args_common);
    tracing::info!("args = {:#?}", &args);

    common::trace_rss_now();

    tracing::info!(
        "All of `seqvars prefilter` completed in {:?}",
        before_anything.elapsed()
    );
    Ok(())
}
