v1.0.1

@nh13 nh13 released this 14 Dec 18:24
· 8 commits to main since this release

What's Changed

  • Add authors to cargo toml by @nh13 in #4
  • Add bioconda to README by @nh13 in #5
  • feat(progress-logger): added a progress logger by @sstadick in #10
  • feat(pretty_progress_logger): uses commas to delimit large numbers by @sstadick in #12
  • Make fqgrep more grep-like in its options by @nh13 in #13
  • Add a rust-toolchain file in #15
  • Unit tests added by @samfulcrum in #16 and #18.

New Contributors

Full Changelog: v0.1.0...v1.0.1

Making it grep-like

This release is a major refactor of the tool and its code, making the command line and behavior very similar to Unix grep.

  1. All reader, writer, and matching threads now use a rayon thread pool, so --threads is respected. Previously, reader and writer threads were always allocated outside the match pool, and there were separate arguments for the match pool and for compressing the output. Output compression has been removed: plain-text FASTQ is the only output format, so pipe the output through a compressor if you need it compressed.
  2. Takes in a pattern as the first positional argument, which is now a regular expression (previously a fixed string).
  3. Takes in zero or more file paths after the positional argument. Uses standard input if no files are given positionally or with -f below.
  4. Input files are assumed to be plain, uncompressed FASTQ unless the --decompress option is given, in which case they are assumed to be GZIP compressed. This includes standard input. The exceptions are files ending in .gz/.bgz and .fastq/.fq, which are always treated as GZIP compressed and plain text, respectively.
  5. Implement the following options from grep:
  • -c, --count: simply return the count of matching records
  • -F, --fixed-strings: interpret pattern as a set of fixed strings
  • -v, --invert-match: select records that do not match any of the specified patterns
  • --color <color>: color the output records with ANSI color codes
  • -e, --regexp <regexp>...: specify the pattern used during the search. Can be specified multiple times
  • -f, --file <file>: read one or more newline-separated patterns from a file
  • -Z, --decompress: treat all files other than .gz, .bgz, .fastq, and .fq as GZIP compressed (default: treat them as uncompressed)
  6. The exit code follows grep: 0 if one or more lines were selected, 1 if no lines were selected, and >1 if an error occurred.

  7. Add non-grep options:

  • --paired: if one file or standard input is used, treat the input as an interleaved paired-end FASTQ. If more than one file is given, ensure that the number of files is a multiple of two, and treat each consecutive pair of files as R1 and R2 respectively. If the pattern matches either R1 or R2, output both (interleaved FASTQ).
  • --reverse-complement: search the reverse complement of each read sequence in addition to the sequence itself
  • --progress: write progress (counts of records searched)
  • -t, --threads <threads>: see (1) above
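
To make a few of the rules above concrete, here is a minimal Rust sketch of the compression decision from (4), the grep-style exit code, the R1/R2 file pairing for --paired, and reverse complementing. It is an illustration under stated assumptions, not the code in fqgrep itself: the function names (is_gzip_input, exit_code, pair_files, reverse_complement) are hypothetical, the error status 2 is just one value satisfying ">1", and the complement table only covers uppercase A/C/G/T.

```rust
use std::path::Path;

/// Decide whether an input should be read as GZIP.
/// The extension wins over the flag: .gz/.bgz are always compressed and
/// .fastq/.fq are always plain text. Anything else, including standard
/// input (passed here as `None`), follows the --decompress flag.
fn is_gzip_input(path: Option<&Path>, decompress: bool) -> bool {
    let ext = path.and_then(|p| p.extension()).and_then(|e| e.to_str());
    match ext {
        Some("gz") | Some("bgz") => true,
        Some("fastq") | Some("fq") => false,
        _ => decompress,
    }
}

/// grep-style exit status: 0 if anything matched, 1 if nothing matched,
/// and a value greater than 1 on error (2 is an assumption here).
fn exit_code(result: Result<usize, String>) -> i32 {
    match result {
        Ok(0) => 1,
        Ok(_) => 0,
        Err(_) => 2,
    }
}

/// Group files into consecutive (R1, R2) pairs for the multi-file case of
/// --paired. The single-file (interleaved) case is not handled here.
fn pair_files<'a>(files: &[&'a str]) -> Result<Vec<(&'a str, &'a str)>, String> {
    if files.len() % 2 != 0 {
        return Err(format!(
            "--paired needs a multiple of two files, got {}",
            files.len()
        ));
    }
    Ok(files.chunks(2).map(|pair| (pair[0], pair[1])).collect())
}

/// Reverse complement of a read sequence. Bases other than uppercase
/// A/C/G/T (for example N) are passed through unchanged.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&base| match base {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            other => other,
        })
        .collect()
}
```

Under these assumptions, reads.fastq.gz is treated as GZIP even without --decompress, reads.fq stays plain text even with it, and standard input follows the flag.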