Error: max() arg is an empty sequence with amplicon (high coverage panels) data #103

Closed
naumenko-sa opened this issue Jun 10, 2021 · 6 comments

Comments

@naumenko-sa

naumenko-sa commented Jun 10, 2021

Dear CRISPresso2 team!

Thanks for developing the tool and helping me so quickly with the WGS case!

Describe the bug
This time I am running CRISPRessoPooled on high-coverage panel data (~10000X, ~115 baits, each 121bp wide) in Amplicons or Genome mode, and I am
getting the error: max() arg is an empty sequence.
I am using a Docker image, and it worked well with WGS data from another experiment.

Expected behavior
Aligning reads to the genome, discovering covered regions, calling editing events.

To reproduce

singularity run \
-e crispresso2_latest.sif \
CRISPRessoPooled \
-r1 ${1}_1.fq -r2 ${1}_2.fq \
-x /path/to/reference/mm10_plus/bowtie2/mm10_plus \
-p 10 -n ${1}_only_genome \
--debug

Debug output

Aligning reads to the provided genome index...
aligning with command: bowtie2 -x /path/to/reference/mm10_plus/bowtie2/mm10_plus -p 10  --end-to-end -N 0 --np 0 --mp 3,2 --score-min L,-5,-1.2  -U CRISPRessoPooled_on_sample_only_genome/out.extendedFrags.fastq.gz 2>>CRISPRessoPooled_on_sample_only_genome/CRISPRessoPooled_RUNNING_LOG.txt| samtools view -bS - | samtools sort -@ 10 - -o CRISPRessoPooled_on_sample_only_genome/sample_only_genome_GENOME_ALIGNED.bam
1681504 reads; of these:
  1681504 (100.00%) were unpaired; of these:
    673 (0.04%) aligned 0 times
    1607636 (95.61%) aligned exactly 1 time
    73195 (4.35%) aligned >1 times
99.96% overall alignment rate
Deleting partially-completed demultiplexing in CRISPRessoPooled_on_sample_only_genome/MAPPED_REGIONS/...
Preparing to demultiplex reads aligned to the genome...
Demultiplexing reads by location (67 genomic regions)...
Parsing the demultiplexed files and extracting locations and reference sequences...
Running CRISPResso on the regions discovered...
Running CRISPResso with 10 processes
Finished all regionsERROR: max() arg is an empty sequence

REPORT_READS_ALIGNED_TO_GENOME_ONLY.txt is empty. In the BAM aligned with bwa mem, I can see the reads aligned over the targets with >1000X coverage.

I also tried extracting the target sequences (baits) from the reference, splitting them into small chunks (40bp), and running the Pooled script. The result was the same: none of the regions has enough reads. Reads are 138-140bp long.

Could you please suggest what might be wrong in the parameters for this run?
Thanks!
Sergey

@kclem
Member

kclem commented Jun 10, 2021

I believe this error could be coming from one of the sub CRISPResso commands run on each region. You can search for the file that contains this error by changing to the CRISPRessoPooled output directory and using the command grep "max() arg is an empty sequence" CRISPResso_on_*/*CRISPResso_RUNNING_LOG.txt. Does that command produce any output or point you to any files? Ideally I'm looking for a line number to trace this error back to so I can fix it.

@naumenko-sa
Author

hello @kclem !

Thanks for the quick response!

I'm trying to process these data in 3 modes.

  1. Genome mode (-x)
    Here all samples fail, and there are no sub-folders in CRISPResso_on_sampleX. The only log file is
    CRISPResso_on_sampleX/CRISPRessoPooled_RUNNING_LOG.txt
    Here is the tail of it:
Aligning reads to the provided genome index...
aligning with command: bowtie2 -x /path/to/bowtie2/mm10_plus -p 10  --end-to-end -N 0 --np 0 --mp 3,2 --score-min L,-5,-1.2  -U CRISPRessoPooled_on_sample_only_genome/out.extendedFrags.fastq.gz 2>>CRISPRessoPooled_on_sample_only_genome/CRISPRessoPooled_RUNNING_LOG.txt| samtools view -bS - | samtools sort -@ 10 - -o CRISPRessoPooled_on_sample_only_genome/sample_only_genome_GENOME_ALIGNED.bam
1681504 reads; of these:
  1681504 (100.00%) were unpaired; of these:
    673 (0.04%) aligned 0 times
    1607636 (95.61%) aligned exactly 1 time
    73195 (4.35%) aligned >1 times
99.96% overall alignment rate
Deleting partially-completed demultiplexing in CRISPRessoPooled_on_sample_only_genome/MAPPED_REGIONS/...
Preparing to demultiplex reads aligned to the genome...
Demultiplexing reads by location (67 genomic regions)...
Parsing the demultiplexed files and extracting locations and reference sequences...
Running CRISPResso on the regions discovered...
Running CRISPResso with 10 processes
Finished all regionsERROR: max() arg is an empty sequence

There is no dump or stack trace in the log, but there is one in the job output when running with --debug:

INFO  @ Thu, 10 Jun 2021 00:55:20:
         Parsing the demultiplexed files and extracting locations and reference sequences...
INFO  @ Thu, 10 Jun 2021 00:55:20:
         Running CRISPResso on the regions discovered...
INFO  @ Thu, 10 Jun 2021 00:55:20:
         Running CRISPResso with 10 processes
INFO  @ Thu, 10 Jun 2021 00:55:20:
         Finished all regions
CRITICAL @ Thu, 10 Jun 2021 00:55:21:
ERROR: max() arg is an empty sequenceTraceback (most recent call last):
  File "/opt/conda/lib/python2.7/site-packages/CRISPResso2-2.1.1-py2.7-linux-x86_64.egg/CRISPResso2/CRISPRessoPooledCORE.py", line 1316, in main
    CRISPRessoPlot.plot_reads_total(plot_root,df_summary_quantification,save_png,args.min_reads_to_use_region)
  File "/opt/conda/lib/python2.7/site-packages/CRISPResso2-2.1.1-py2.7-linux-x86_64.egg/CRISPResso2/CRISPRessoPlot.py", line 1734, in plot_reads_total
    if max(df['Reads_total'] > 100000):
ValueError: max() arg is an empty sequence
  2. Amplicon mode with narrow baits (40bp)
    Same as the genome mode: no sub-folders.
    Error message: 'Skipping the folder : not enough reads, incomplete, or empty folder'

  3. Amplicon mode with wide baits (>121bp, baits were sorted and merged if overlapping)

Most samples actually work!

But some samples give another error:

Align reads to the amplicons...
Alignment command: bowtie2 -x CRISPRessoPooled_on_sample_merged_baits/CUSTOM_BOWTIE2_INDEX -p 10  --end-to-end -N 0 --np 0 --mp 3,2 --score-min L,-5,-1.2  -U CRISPRessoPooled_on_sample_merged_baits/out.extendedFrags.fastq.gz 2>>CRISPRessoPooled_on_sample_merged_baits/CRISPRessoPooled_RUNNING_LOG.txt | samtools view -bS - > CRISPRessoPooled_on_sample_merged_baits/CRISPResso_AMPLICONS_ALIGNED.bam
2145972 reads; of these:
  2145972 (100.00%) were unpaired; of these:
    25746 (1.20%) aligned 0 times
    2119038 (98.74%) aligned exactly 1 time
    1188 (0.06%) aligned >1 times
98.80% overall alignment rate
ERROR: list index out of range
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

Sergey

@naumenko-sa
Author

update:
for the samples and probes where CRISPResso2 generated a report, it reports 100% reads modified!
That is surprising, since in the WGS project there is always a fraction < 100%.

kclem added a commit that referenced this issue Jun 24, 2021
If no regions are returned, max of a pandas dataframe returns an error because the df is empty
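
The failure mode named in this commit message can be reproduced outside CRISPResso2: Python's built-in max() raises ValueError when given an empty sequence, which is the situation when no region survives the read filter. The sketch below only illustrates a guard of that kind and is not the actual patch. The function name exceeds_read_threshold and its parameters are hypothetical, and a plain list stands in for the 'Reads_total' column.

```python
def exceeds_read_threshold(reads_total, threshold=100000):
    """Return True if any region has more than `threshold` reads.

    Hypothetical helper: `reads_total` stands in for the 'Reads_total'
    column. An empty input (no regions discovered) returns False
    instead of raising ValueError.
    """
    values = list(reads_total)
    if not values:
        # Guard: max() on an empty sequence raises ValueError,
        # which is the error reported in this issue.
        return False
    return max(values) > threshold


if __name__ == "__main__":
    # Unguarded call on an empty sequence: shows the underlying error.
    try:
        max([])
    except ValueError as err:
        print("unguarded:", err)

    # Guarded calls: an empty input is handled, a non-empty one is compared.
    print("guarded, no regions:", exceeds_read_threshold([]))
    print("guarded, two regions:", exceeds_read_threshold([1200, 250000]))
```

With a guard like this, an empty summary table skips the comparison instead of aborting the run, so the real problem (no regions with enough reads) is what remains to be diagnosed.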
@kclem
Member

kclem commented Jun 24, 2021

Ok. I have fixed the ERROR: max() arg is an empty sequence bug -- but this isn't your main problem. Your main problem is that insufficient reads are aligning to the locations where you think they should be aligning.

CRISPRessoPooled was designed to analyze pooled amplicon sequencing experiments -- is that what your data is? PCR primers for each amplicon create reads that start and stop at exactly the same genomic position. If your reads don't start at the given start position, they will be discarded and you'll end up with an error or empty data as above.

In the genome-only mode (-x) check the produced bam file "CRISPRessoPooled_on_sample_only_genome/sample_only_genome_GENOME_ALIGNED.bam" to make sure that they are aligning to the proper sequences. Then, for one region which has sufficient coverage (>1000x) pull out the genomic region corresponding to one of the reads, and use that for CRISPRessoPooled in amplicon mode (running with this single amplicon). Is this single amplicon sequence one of the ones you provided in the amplicons mode? If not, there may be a problem with the way you define your amplicons. Does this work for you?
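
One rough way to tell amplicon data from capture-panel data is to look at how concentrated the read start positions are within a region. The sketch below assumes the alignment start positions for one region have already been extracted from the BAM (for example, from the POS column of samtools view output). The function name start_position_concentration is hypothetical, and the thread gives no cutoff value, so none is suggested here.

```python
from collections import Counter


def start_position_concentration(read_starts):
    """Fraction of reads that share the single most common start position.

    Values near 1.0 suggest amplicon-like data (reads begin at a primer);
    values near 0 suggest tiled or capture-style coverage.
    """
    starts = list(read_starts)
    if not starts:
        return 0.0
    most_common_count = Counter(starts).most_common(1)[0][1]
    return most_common_count / len(starts)


if __name__ == "__main__":
    # Made-up positions for illustration only.
    amplicon_like = [1000] * 95 + [1001] * 5   # almost all reads start together
    capture_like = list(range(900, 1000))      # every read starts somewhere else
    print(start_position_concentration(amplicon_like))
    print(start_position_concentration(capture_like))
```

If most regions show a low concentration, the reads do not behave like amplicons, and being discarded by CRISPRessoPooled is the expected outcome.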

@naumenko-sa
Author

Hi @kclem !

Thanks for the fix and the explanation!
You are right: this is not amplicon sequencing but rather high-coverage panel data, i.e. reads don't start at the same position.

The file you mentioned, CRISPRessoPooled_on_sample_only_genome/sample_only_genome_GENOME_ALIGNED.bam, does have reads aligned to the probes.

So the solution would be to trim all the reads to the same borders; that would make this data similar to amplicons!

Sergey

@kclem
Member

kclem commented Jun 27, 2021

Yeah -- you should try CRISPRessoWGS -- this will automatically trim reads that completely overlap the query regions and analyze them using CRISPResso. Just make sure that your query regions are shorter than the read length (otherwise no reads will completely cover the query regions). Good luck!
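
The two conditions mentioned above (a read must completely overlap the query region, and the region must be shorter than the read) can be sketched as simple coordinate checks. This is an illustration of the idea only, not CRISPRessoWGS code: the function names are hypothetical, and half-open [start, end) coordinates are an assumption.

```python
def read_covers_region(read_start, read_end, region_start, region_end):
    """True if the read spans the whole region (half-open coordinates)."""
    return read_start <= region_start and read_end >= region_end


def region_can_be_covered(region_start, region_end, read_length):
    """True if the region is shorter than the read length."""
    return (region_end - region_start) < read_length


if __name__ == "__main__":
    # A 121bp region and 138bp reads, matching the sizes quoted in this thread.
    region = (1000, 1121)
    print(region_can_be_covered(*region, read_length=138))  # region fits in a read
    print(read_covers_region(990, 1128, *region))           # read spans the region
    print(read_covers_region(1010, 1148, *region))          # read starts inside it
```

A region that fails the second check can never be fully covered by a single read, so it would need to be shortened before the analysis.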

kclem closed this as completed Jun 27, 2021
kclem pushed a commit that referenced this issue Oct 17, 2024
* Mckay/c2pro reports test (#99)


        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        * Pass div id for plotly

        * Remove debug

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Trevor Martin <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

    commit 1c504274818b6b17fb60620d48fd92cb2e50566d
    Author: Cole Lyman <[email protected]>
    Date:   Thu May 9 14:16:25 2024 -0600

        Fix plots and improve plot error handling (#431)

        * Sam/try plots (#71)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix the second run that was compared
        would not be displayed as a data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Point test to try-plots

        * Fix d3 not showing and plotly mixing with matplotlib

        * Use logger for warnings and debug statements

        * Point tests back at master

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Sam/fix plots (#72)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix the second run that was compared
        would not be displayed as a data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

    commit acb2ea8e26dff4cd11f71301b344f81b1cec9040
    Author: Kendell Clement <[email protected]>
    Date:   Thu May 2 13:49:33 2024 -0600

        Use recent docker image for CircleCI testing that includes updated pandas

    commit 38fd76dbd7ce2087468f9f454b548777de959a68
    Author: Cole Lyman <[email protected]>
    Date:   Wed May 1 16:42:28 2024 -0600

        Cole/fix status file name (#69) (#430)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix the second run that was compared
        would not be displayed as a data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

    commit 3ec22e5fd09e432c9997d30e5f9ee51a2cc00d7b
    Author: Kendell Clement <[email protected]>
    Date:   Wed May 1 13:08:11 2024 -0600

        Remove linked space in readme

    commit 340a4e16795a5e500411e11572ec267525985009
    Author: Cole Lyman <[email protected]>
    Date:   Wed May 1 13:07:14 2024 -0600

        Fix batch mode pandas warning. (#70) (#429)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: mbowcut2 <[email protected]>

    commit 1bc9e906f0ded81f80761d1ec375ee50a4f882a9
    Author: Cole Lyman <[email protected]>
    Date:   Fri Apr 26 16:26:27 2024 -0600

        Bump version to 2.3.1 and change default CRISPRessoPooled behavior to change in 2.3.2 (#428)

    commit 5638a1f6ffa973231f23422e9c757fa8cd4af7cc
    Author: Kendell Clement <[email protected]>
    Date:   Wed Apr 24 18:00:43 2024 -0600

        Spelling fixes

    commit d6011f29db16d8fc1c1e7222457b7f9a1f671de6
    Author: Cole Lyman <[email protected]>
    Date:   Wed Apr 24 09:33:53 2024 -0600

        Extract `jinja_partials` and fix CRISPRessoPooled fastp errors (#425)

        * Updated README (#64)

        * Updating README to fix argument, email, and formatting

        * removing superfluous files

        * Add link to CRISPRessoPro, move CRISPRessoPro section to end, and fix JSON formatting

        * Remove link to CRISPRessoPro

        * Replace Docker badge with link to tags

        * Add bullet points to Guardrails section and improve formatting

        * Fix typo and removed colons from guardrails

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Extract jinja_partials  (#65)

        * Extract jinja_partials code

        * Remove Plotly dependency from setup.py

        * Fix CRISPRessoPooled flash errors (#68)

        * Fix replacing flash intermediate files with fastp intermediate files

        This also moves where the files are added to `files_to_remove` up to
        near where they are created.

        * Update to run test branch with paired end Pooled test

        * Add pooled-paired-sim test to integration tests

        * Replace flash and trimmomatic with fastp and remove plotly from Github Actions environment

        * Change test branch back to master

        ---------

        Co-authored-by: Trevor Martin <[email protected]>

    commit f4858a30c43374f54058b3ad9c1e965e1ab7fb46
    Author: Cole Lyman <[email protected]>
    Date:   Tue Apr 23 17:00:28 2024 -0600

        Updated README (#64) (#424)

        * Updating README to fix argument, email, and formatting

        * removing superfluous files

        * Add link to CRISPRessoPro, move CRISPRessoPro section to end, and fix JSON formatting

        * Remove link to CRISPRessoPro

        * Replace Docker badge with link to tags

        * Add bullet points to Guardrails section and improve formatting

        * Fix typo and removed colons from guardrails

        ---------

        Co-authored-by: Trevor Martin <[email protected]>

    commit c3dbff0fccd44b0b1a9c246dd2aa629ddc515787
    Author: Kendell Clement <[email protected]>
    Date:   Mon Apr 22 11:24:59 2024 -0600

        Update CRISPRessoPooledCORE.py (#423)

        Fix bug in error reporting if duplicate names are present

    commit 20903c14877e5166b1b8a7b50b8fcab450ea3ca6
    Author: Cole Lyman <[email protected]>
    Date:   Thu Apr 18 16:55:39 2024 -0600

        Remove extra imports from CRISPRessoCore (#67) (#422)

    commit 4aae57e5be475cd717792265bee36a71a99425de
    Author: Cole Lyman <[email protected]>
    Date:   Thu Apr 18 10:00:19 2024 -0600

        Cole/refactor jinja undefined (#66) (#421)

        * Replace Jinja2 PackageLoader with FileSystemLoader

        The PackageLoader doesn't work with a fairly recent version of Jinja2 (3.0.1)
        and Python 3.9. Replacing it with FileSystemLoader works with the older version and
        the latest version.

        * Fix undefined variable `amplicon_name` in report template

        * Refactor logging Jinja2 undefined variable warnings

        * Revert plot_11a update

        * Update integration test branch

        * Update jinja to warn on undefined but not fail. Fix all undefined warnings

        * Fix github integration tests ref

        * One more undefined variable

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>

    commit 768c3c05bf1786a2a32e135b6e145cd6503c3db1
    Author: Cole Lyman <[email protected]>
    Date:   Tue Apr 9 17:30:10 2024 -0600

        Fix Jinja2 undefined variables (#63) (#417)

        * Replace Jinja2 PackageLoader with FileSystemLoader

        The PackageLoader doesn't work with a fairly recent version of Jinja2 (3.0.1)
        and Python 3.9. Replacing it with FileSystemLoader works with the older version and
        the latest version.

        * Fix undefined variable `amplicon_name` in report template

        * Revert plot_11a update

        * Update integration test branch

        * Update branch for integration tests

    commit 7e18f08cc1ac5f247a0fd1bbb394ccd9b0a07c2e
    Author: Han Dai <[email protected]>
    Date:   Fri Apr 5 18:36:41 2024 -0400

        fix: change all U+00A0 to U+0020 (#400)

    commit 235dc29c0cd0fcca2e999148d4660acf00b07221
    Author: Cole Lyman <[email protected]>
    Date:   Fri Apr 5 16:36:16 2024 -0600

        Fastp, args as data, guardrails, and PE fix (#415)

        * Change CRISPResso_status.txt format to JSON (#46)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * add json read for status file

        * changed Formatter to json format

        * fixed json access variable name: message

        * changed  perentage_complete to numeric

        * changed status file to .json

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * New makefile commands

        * changed file to .json

        * changed status to json file

        * Make JSON human readable by adding new lines

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * point to test branch

        * pointed CI config to testing branch

        * Update integration_tests.yml

        point to master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>

        * Trevor/fastp integration (#50)

        * Update check_program to check versions and create check_fastq function

        * Update fastq arg, implement fastp in get_most_frequent_reads

        * Bump version to 2.3.0

        * Deprecate Flash and Trimmomatic parameters, and update fastp params

        * Update guess_amplicons and guess_guides to remove max_paired_end_reads_overlap

        * Implement trimming of single end reads

        * Merge (and trim) reads in CRISPRessoCORE with fastp

        * Modify error handling to account for fastp errors

        * Replace flash and trimmomatic with fastp in Docker dependencies

        * Update LICENSE.txt with fastp info

        * Remove min and max amplicon length (no longer needed)

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Implement trimming with fastp in CRISPRessoPooled

        * Implement merging (and trimming) with fastp in CRISPRessoPooled

        * Fixed minor fastp errors

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * Update where the test point to

        * Fix 'Prime-edited' key not found (#32)

        * Move 'Prime-edited' amplicon name check

        By moving this, it will check if there is an amplicon named
        'Prime-edited' (which is a reserved name) even if the
        `prime_editing_pegRNA_extension_seq` parameter is empty.

        * Only search for scaffold integration when pegRNA extension seq is provided

        * Remove spaces at the end of lines

        * Docker size (#49)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * 3.4->2.08

        * Put ttf-mscorefonts-installer back above apt-get clean

        * restore slash, replace fastp with trimmomatic and flash, add autoremove step

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * initial readme modifications

        * Updated readme to remove deprecated commands, updated help text to reflect new version and fastp

        * Pointing test branch back at master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>

        * Guardrails clean history (#34)

        * Include guardrail functions

        * Add CRISPRessoReports subtree

        * Refactor to use CRISPRessoReports module

        * Include guardrail functions

        * Functional guardrails, needs reports update

        * Add guardrail partial

        * fix guardrails partial

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Update C cythonized files

        * Add exact numbers to guardrails printouts

        * Remove extraneous whitespace from CRISPRessoCOREResources.pyx

        * Fix calculation of `total_mods` from being negative

        The issue was that `all_deletion_coordinates` only tells you how many deletions
        were present, not how long each deletion is.

        * Changes to message

        * Remove old tag

        * Point tests at guardrails

        * Restore C2 pro check

        * Save message with guardrail name

        * Point tests repo at master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

        * Fix case sensitivity in Prime Editing mode (#54)

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * Make all amplicons in amplicon_seq_arr uppercase

        This fixes https://github.com/pinellolab/CRISPResso2/issues/396

        * Allow RNA values to be provided for prime_editing_pegRNA_scaffold_seq

        * Fix 'Prime-edited' key not found (#32)

        * Move 'Prime-edited' amplicon name check

        By moving this, it will check if there is an amplicon named
        'Prime-edited' (which is a reserved name) even if the
        `prime_editing_pegRNA_extension_seq` parameter is empty.

        * Only search for scaffold integration when pegRNA extension seq is provided

        * Remove spaces at the end of lines

        * Docker size (#49)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_nu…

Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: Trevor Martin <[email protected]>
Co-authored-by: Samuel Nichols <[email protected]>
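
The `total_mods` fix listed in the commit message above turns on the difference between counting deletion events and counting deleted bases. A minimal sketch of that distinction follows. It assumes, purely for illustration, that deletions are stored as `(start, end)` pairs with an exclusive end; the helper names are hypothetical and this is not the CRISPResso2 implementation.

```python
# Illustrative sketch only: the data layout and helper names are assumptions,
# not taken from the CRISPResso2 source.

def count_deletion_events(deletion_coordinates):
    """Number of separate deletions, regardless of their length."""
    return len(deletion_coordinates)


def count_deleted_bases(deletion_coordinates):
    """Total number of bases removed across all deletions.

    Each coordinate is assumed to be a (start, end) pair, end exclusive.
    """
    return sum(end - start for start, end in deletion_coordinates)


if __name__ == "__main__":
    deletions = [(10, 15), (40, 42)]  # one 5-base and one 2-base deletion
    print(count_deletion_events(deletions))  # 2
    print(count_deleted_bases(deletions))    # 7
```

The two quantities agree only when every deletion is a single base long, so a formula that mixes an event count with per-base counts can drift, which is the kind of mismatch the commit message describes.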
kclem pushed a commit that referenced this issue Dec 13, 2024
* Mckay/halt on plot fail (#103)

* Mckay/c2pro reports test (#99)

* Fix CRISPRessoAggregate bug and other improvements (#95)

* D3-Enhancements (#78)

* Sam/try plots (#71)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------

* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing the exception (which is essentially a duplicate),
and adds a condition if no config file was provided. Also changes `json`
to `config` so that it is more clear.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when using a file_prefix, the second run that was compared
would not be displayed as data in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------

* Try catch all futures

* Fix test fail plots

* Point test to try-plots

* Fix d3 not showing and plotly mixing with matplotlib

* Use logger for warnings and debug statements

* Point tests back at master

---------

* Sam/fix plots (#72)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------

* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing the exception (which is essentially a duplicate),
and adds a condition if no config file was provided. Also changes `json`
to `config` so that it is more clear.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when using a file_prefix, the second run that was compared
would not be displayed as data in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------

* Try catch all futures

* Fix test fail plots

* Fix d3 not showing and plotly mixing with matplotlib

---------

* Remove token from integration tests file

* Provide sgRNA_sequences to plot_nucleotide_quilt plots

* Passing sgRNA_sequences to plot

* Refactor check for determining when to use CRISPREssoPro or matplotlib for Batch plots

* Add max-height to Batch report samples

* Change testing branch

* Fix wrong check for large Batch plots

* Fix typo and move flexiguide to debug (#77)

* Change flexiguide output to debug level

* Fix typo in fastp merged output file name

* Adding id tags for d3 script enhancements

* pointing to test branch

* Add amplicon_name parameter to allele heatmap and line plots

* Add function to extract quantification window regions from include_idxs

* Scale the quantification window according to the coordinates of the sgRNA plot

* added c2pro check, added space in args.json

* Correct the quantification window indexes for multiple guides

* Fix name of nucleotide conversion plot when guides are not the same

* Fix jinja variables that aren't found

* Fix multiple guide errors where the wrong sgRNA sequence was associated in d3 plot

* Remove unneeded variable and extra whitespace

* Switch test branch to master

---------

* Add amplicon_name to plot functions

* Add sgRNA sequences to nucleotide quilt parameters in Aggregate

* Add custom_colors to Aggregate plot functions

* Update Aggregate and make_aggregate_report to have logger and tool

* Write command_used to Aggregate .json info file

* Point to new test branch and add Aggregate run

* Make the order of Aggregate runs explicit

* Sort all instances of crispresso2_folder_info in Aggregate

* Sort df_summary_quantification df in Aggregate

* Try sorting with a list of single column

* Update to correct test branch

* Move to master test branch

---------

* Squashed commit of the following:

commit 6ec98a05ee70f85b5aa0ac15ab6094b7f1f20d08
Author: mbowcut2 <[email protected]>
Date:   Tue Aug 13 16:44:39 2024 -0600

    dict key changes

commit 7cfd5acf06da4eb6f49453144ee1fed1e1488a7a
Author: mbowcut2 <[email protected]>
Date:   Thu Aug 8 15:30:31 2024 -0600

    added C2PRO install check back

commit bfb0003329ea61b5c79c7e1df8d9a73ec5a508db
Author: mbowcut2 <[email protected]>
Date:   Fri Aug 2 13:08:12 2024 -0600

    fixed key error conditionals

commit 84444e7480605206cb3efa4a0db675c55e717304
Author: mbowcut2 <[email protected]>
Date:   Fri Aug 2 09:22:44 2024 -0600

    use local jinja_paritals file

commit 71dd12786fec6c4aba0170a3bfd9022b06f5eede
Author: mbowcut2 <[email protected]>
Date:   Wed Jul 31 14:10:29 2024 -0600

    Squashed commit of the following:

    commit 5e3b30515c4bc437127e7fb21f53cb0bd511c4ca
    Author: Trevor Martin <[email protected]>
    Date:   Mon Jul 22 09:31:44 2024 -0600

        D3-Enhancements (#78)

        * Sam/try plots (#71)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Point test to try-plots

        * Fix d3 not showing and plotly mixing with matplotlib

        * Use logger for warnings and debug statements

        * Point tests back at master

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Sam/fix plots (#72)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        * Provide sgRNA_sequences to plot_nucleotide_quilt plots

        * Passing sgRNA_sequences to plot

        * Refactor check for determining when to use CRISPREssoPro or matplotlib for Batch plots

        * Add max-height to Batch report samples

        * Change testing branch

        * Fix wrong check for large Batch plots

        * Fix typo and move flexiguide to debug (#77)

        * Change flexiguide output to debug level

        * Fix typo in fastp merged output file name

        * Adding id tags for d3 script enhancements

        * pointing to test branch

        * Add amplicon_name parameter to allele heatmap and line plots

        * Add function to extract quantification window regions from include_idxs

        * Scale the quantification window according to the coordinates of the sgRNA plot

        * added c2pro check, added space in args.json

        * Correct the quantification window indexes for multiple guides

        * Fix name of nucleotide conversion plot when guides are not the same

        * Fix jinja variables that aren't found

        * Fix multiple guide errors where the wrong sgRNA sequence was associated in d3 plot

        * Remove unneeded variable and extra whitespace

        * Switch test branch to master

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

    commit 09e5d9720ad21e44fc7916d71bde3fd7a9dfa7ef
    Author: Kendell Clement <[email protected]>
    Date:   Thu Jul 18 14:31:54 2024 -0600

        Asymmetrical cut point (#457)

        * add cut_point_ind to plot_alleles_heatmap for asymmetrical plotting

        * Cole asymmetrical cut point (#453)

        * Pin versions of numpy and matplotlib in CI environment (#84) (#452)

        * Reduce duplication and implement cut_point_ind in plot_alleles_heatmap_hist

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

    commit 8d92972694ddff629dad844a6ad100459f69751d
    Author: Cole Lyman <[email protected]>
    Date:   Thu Jul 18 14:29:40 2024 -0600

        Cole/update args (#85) (#456)

    commit 44f692ecabf5e2eb96ee0cfd7bae62343da7810c
    Author: Cole Lyman <[email protected]>
    Date:   Mon Jul 15 16:17:29 2024 -0600

        Implement new pooled mixed-mode default behavior (#454)

        * changes for pooled mixed-mode default (#83)

        * changes for pooled mixed-mode default

        * deprecated old arg

        * added integration tests for mixed mode

        * fixed test target

        * updated test name

        * pinned numpy

        * Fix integration tests yml

        * pinning matplotlib

        * added print to CI tests

        * changed mixed mode info string

        * Remove pooled-mixed-mode-align-to-genome step from Github Actions

        * Update demultiplex_genome_wide parameter and help

        * Convert args.json to unix line endings

        * Add Pooled mixed mode demux run

        * Update the name of the argument in Pooled

        * Point integration tests back to master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Revert change to pooled mixed mode info statement (#86)

        ---------

        Co-authored-by: mbowcut2 <[email protected]>

    commit 79b482b55a0e8edbc03ec22bd2714bade1e90323
    Author: Cole Lyman <[email protected]>
    Date:   Tue Jul 9 12:53:23 2024 -0600

        Pin versions of numpy and matplotlib in CI environment (#84) (#452)

    commit 80dc1bdd72d50f989717bfc5f8156bc3495c45f4
    Author: Kendell Clement <[email protected]>
    Date:   Thu May 30 14:07:42 2024 -0600

        Add padding to image

    commit 381755daf0939aaf2745df0a802c809633aff47d
    Author: Kendell Clement <[email protected]>
    Date:   Thu May 30 13:59:57 2024 -0600

        White background for schematic for dark mode

    commit d649db71e610bd8840fbb8d46fadb07789b67390
    Author: Cole Lyman <[email protected]>
    Date:   Fri May 24 12:45:53 2024 -0600

        Fix typo and move flexiguide to debug (#77) (#438)

        * Change flexiguide output to debug level

        * Fix typo in fastp merged output file name

    commit 71181f50ef2b39015523b1a71d9fd1bf0dce14eb
    Author: Cole Lyman <[email protected]>
    Date:   Mon May 13 13:34:00 2024 -0600

        Prefix the release Docker tag with a `v` (#434)

    commit d2c2be18a6bb64b0e742cc24c4665980a24324bc
    Author: Cole Lyman <[email protected]>
    Date:   Mon May 13 09:41:32 2024 -0600

        Showing sgRNA sequences on hover in CRISPRessoPro (#432)

        * Passing sgRNA sequences to regular and Batch D3 plots (#73)

        * Sam/try plots (#71)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Point test to try-plots

        * Fix d3 not showing and plotly mixing with matplotlib

        * Use logger for warnings and debug statements

        * Point tests back at master

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Sam/fix plots (#72)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        * Provide sgRNA_sequences to plot_nucleotide_quilt plots

        * Passing sgRNA_sequences to plot

        * Refactor check for determining when to use CRISPREssoPro or matplotlib for Batch plots

        * Add max-height to Batch report samples

        * Change testing branch

        * Fix wrong check for large Batch plots

        * Update integration_tests.yml to point back at master

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Push new releases to ECR (#74)

        * Create aws_ecr.yml (#1)

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * us-east-1

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Update aws_ecr.yml

        * Fix d3 sgRNA sequences (#76)

        * Pass correct sgRNA_sequences to d3 plot

        * Pass correct sgRNA sequence to prime editor plot for d3

        * Resize plotly (#75)

        * Sam/try plots (#71)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Point test to try-plots

        * Fix d3 not showing and plotly mixing with matplotlib

        * Use logger for warnings and debug statements

        * Point tests back at master

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Sam/fix plots (#72)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        * Pass div id for plotly

        * Remove debug

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Trevor Martin <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

    commit 1c504274818b6b17fb60620d48fd92cb2e50566d
    Author: Cole Lyman <[email protected]>
    Date:   Thu May 9 14:16:25 2024 -0600

        Fix plots and improve plot error handling (#431)

        * Sam/try plots (#71)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare when pointing to report data with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Point test to try-plots

        * Fix d3 not showing and plotly mixing with matplotlib

        * Use logger for warnings and debug statements

        * Point tests back at master

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Sam/fix plots (#72)

        * Fix batch mode pandas warning. (#70)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Functional

        * Cole/fix status file name (#69)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare when pointing to report data with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

        * Try catch all futures

        * Fix test fail plots

        * Fix d3 not showing and plotly mixing with matplotlib

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Remove token from integration tests file

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

    commit acb2ea8e26dff4cd11f71301b344f81b1cec9040
    Author: Kendell Clement <[email protected]>
    Date:   Thu May 2 13:49:33 2024 -0600

        Use recent docker image for CircleCI testing that includes updated pandas

    commit 38fd76dbd7ce2087468f9f454b548777de959a68
    Author: Cole Lyman <[email protected]>
    Date:   Wed May 1 16:42:28 2024 -0600

        Cole/fix status file name (#69) (#430)

        * Update config file logging messages

        This removes printing the exception (which is essentially a duplicate),
        and adds a condition if no config file was provided. Also changes `json`
        to `config` so that it is more clear.

        * Fix divide by zero when no amplicons are present in Batch mode

        * Don't append file_prefix to status file name

        * Place status files in output directories

        * Update tests branch for file_prefix addition

        * Load D3 and plotly figures with pro with multiple amplicons

        * Update batch

        * Fix bug in CRISPRessoCompare when pointing to report data with file_prefix

        Before this fix, when using a file_prefix, the second run that was compared
        would not be displayed as data in the first figure of the report.

        * Import CRISPRessoPro instead of importing the version

        When installed via conda, the version is not available

        * Remove `get_amplicon_output` unused function from CRISPRessoCompare

        Also remove unused argparse import

        * Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

        * Allow for matching of multiple guides in the same amplicon

        * Fix pandas FutureWarning

        * Change test branch back to master

        ---------

        Co-authored-by: Sam <[email protected]>

    commit 3ec22e5fd09e432c9997d30e5f9ee51a2cc00d7b
    Author: Kendell Clement <[email protected]>
    Date:   Wed May 1 13:08:11 2024 -0600

        Remove linked space in readme

    commit 340a4e16795a5e500411e11572ec267525985009
    Author: Cole Lyman <[email protected]>
    Date:   Wed May 1 13:07:14 2024 -0600

        Fix batch mode pandas warning. (#70) (#429)

        * refactor to call method on DataFrame, rather than Series.
        Removes warning.

        * Fix pandas future warning in CRISPRessoWGS

        ---------

        Co-authored-by: mbowcut2 <[email protected]>

    commit 1bc9e906f0ded81f80761d1ec375ee50a4f882a9
    Author: Cole Lyman <[email protected]>
    Date:   Fri Apr 26 16:26:27 2024 -0600

        Bump version to 2.3.1 and change default CRISPRessoPooled behavior to change in 2.3.2 (#428)

    commit 5638a1f6ffa973231f23422e9c757fa8cd4af7cc
    Author: Kendell Clement <[email protected]>
    Date:   Wed Apr 24 18:00:43 2024 -0600

        Spelling fixes

    commit d6011f29db16d8fc1c1e7222457b7f9a1f671de6
    Author: Cole Lyman <[email protected]>
    Date:   Wed Apr 24 09:33:53 2024 -0600

        Extract `jinja_partials` and fix CRISPRessoPooled fastp errors (#425)

        * Updated README (#64)

        * Updating README to fix argument, email, and formatting

        * removing superfluous files

        * Add link to CRISPRessoPro, move CRISPRessoPro section to end, and fix JSON formatting

        * Remove link to CRISPRessoPro

        * Replace Docker badge with link to tags

        * Add bullet points to Guardrails section and improve formatting

        * Fix typo and removed colons from guardrails

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Extract jinja_partials  (#65)

        * Extract jinja_partials code

        * Remove Plotly dependency from setup.py

        * Fix CRISPRessoPooled flash errors (#68)

        * Fix replacing flash intermediate files with fastp intermediate files

        This also moves the point where the files are added to `files_to_remove`
        closer to where they are created.

        * Update to run test branch with paired end Pooled test

        * Add pooled-paired-sim test to integration tests

        * Replace flash and trimmomatic with fastp and remove plotly from Github Actions environment

        * Change test branch back to master

        ---------

        Co-authored-by: Trevor Martin <[email protected]>

    commit f4858a30c43374f54058b3ad9c1e965e1ab7fb46
    Author: Cole Lyman <[email protected]>
    Date:   Tue Apr 23 17:00:28 2024 -0600

        Updated README (#64) (#424)

        * Updating README to fix argument, email, and formatting

        * removing superfluous files

        * Add link to CRISPRessoPro, move CRISPRessoPro section to end, and fix JSON formatting

        * Remove link to CRISPRessoPro

        * Replace Docker badge with link to tags

        * Add bullet points to Guardrails section and improve formatting

        * Fix typo and removed colons from guardrails

        ---------

        Co-authored-by: Trevor Martin <[email protected]>

    commit c3dbff0fccd44b0b1a9c246dd2aa629ddc515787
    Author: Kendell Clement <[email protected]>
    Date:   Mon Apr 22 11:24:59 2024 -0600

        Update CRISPRessoPooledCORE.py (#423)

        Fix bug in error reporting if duplicate names are present

    commit 20903c14877e5166b1b8a7b50b8fcab450ea3ca6
    Author: Cole Lyman <[email protected]>
    Date:   Thu Apr 18 16:55:39 2024 -0600

        Remove extra imports from CRISPRessoCore (#67) (#422)

    commit 4aae57e5be475cd717792265bee36a71a99425de
    Author: Cole Lyman <[email protected]>
    Date:   Thu Apr 18 10:00:19 2024 -0600

        Cole/refactor jinja undefined (#66) (#421)

        * Replace Jinja2 PackageLoader with FileSystemLoader

        The PackageLoader doesn't work with a fairly recent version of Jinja2 (3.0.1)
        and Python 3.9. Replacing it with FileSystemLoader works with both the older
        version and the latest version.

        * Fix undefined variable `amplicon_name` in report template

        * Refactor logging Jinja2 undefined variable warnings

        * Revert plot_11a update

        * Update integration test branch

        * Update jinja to warn on undefined but not fail. Fix all undefined warnings

        * Fix github integration tests ref

        * One more undefined variable

        ---------

        Co-authored-by: Samuel Nichols <[email protected]>

    commit 768c3c05bf1786a2a32e135b6e145cd6503c3db1
    Author: Cole Lyman <[email protected]>
    Date:   Tue Apr 9 17:30:10 2024 -0600

        Fix Jinja2 undefined variables (#63) (#417)

        * Replace Jinja2 PackageLoader with FileSystemLoader

        The PackageLoader doesn't work with a fairly recent version of Jinja2 (3.0.1)
        and Python 3.9. Replacing it with FileSystemLoader works with both the older
        version and the latest version.

        * Fix undefined variable `amplicon_name` in report template

        * Revert plot_11a update

        * Update integration test branch

        * Update branch for integration tests

    commit 7e18f08cc1ac5f247a0fd1bbb394ccd9b0a07c2e
    Author: Han Dai <[email protected]>
    Date:   Fri Apr 5 18:36:41 2024 -0400

        fix: change all U+00A0 to U+0020 (#400)

    commit 235dc29c0cd0fcca2e999148d4660acf00b07221
    Author: Cole Lyman <[email protected]>
    Date:   Fri Apr 5 16:36:16 2024 -0600

        Fastp, args as data, guardrails, and PE fix (#415)

        * Change CRISPResso_status.txt format to JSON (#46)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * add json read for status file

        * changed Formatter to json format

        * fixed json access variable name: message

        * changed  perentage_complete to numeric

        * changed status file to .json

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * New makefile commands

        * changed file to .json

        * changed status to json file

        * Make JSON human readable by adding new lines
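The status-file bullets above (JSON formatter, numeric completion value, new lines for readability) can be illustrated with a minimal sketch. It assumes a `logging.Formatter` subclass is used to render each status update; the class name `JsonStatusFormatter`, the helper `render_status`, and the key names `message` and `percent_complete` are hypothetical and are not taken from the CRISPResso2 source.

```python
import json
import logging


class JsonStatusFormatter(logging.Formatter):
    """Hypothetical formatter that renders a log record as a JSON object."""

    def format(self, record):
        status = {
            "message": record.getMessage(),
            # Stored as a number rather than a string so readers can compare it.
            "percent_complete": float(getattr(record, "percent_complete", 0.0)),
        }
        # indent=2 puts each key on its own line, keeping the file human readable.
        return json.dumps(status, indent=2)


def render_status(message, percent_complete):
    """Build a log record by hand and format it as a status payload."""
    record = logging.LogRecord(
        name="status",
        level=logging.INFO,
        pathname="",
        lineno=0,
        msg=message,
        args=None,
        exc_info=None,
    )
    # Attach the completion value as an extra attribute on the record.
    record.percent_complete = percent_complete
    return JsonStatusFormatter().format(record)


if __name__ == "__main__":
    print(render_status("Aligning reads to the provided genome index...", 42))
```

A reader of such a file could then call `json.loads` on its contents instead of parsing free text.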

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * point to test branch

        * pointed CI config to testing branch

        * Update integration_tests.yml

        point to master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>

        * Trevor/fastp integration (#50)

        * Update check_program to check versions and create check_fastq function

        * Update fastq arg, implement fastp in get_most_frequent_reads

        * Bump version to 2.3.0

        * Deprecate Flash and Trimmomatic parameters, and update fastp params

        * Update guess_amplicons and guess_guides to remove max_paired_end_reads_overlap

        * Implement trimming of single end reads

        * Merge (and trim) reads in CRISPRessoCORE with fastp

        * Modify error handling to account for fastp errors

        * Replace flash and trimmomatic with fastp in Docker dependencies

        * Update LICENSE.txt with fastp info

        * Remove min and max amplicon length (no longer needed)

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Implement trimming with fastp in CRISPRessoPooled

        * Implement merging (and trimming) with fastp in CRISPRessoPooled

        * Fixed minor fastp errors

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * Update where the test point to

        * Fix 'Prime-edited' key not found (#32)

        * Move 'Prime-edited' amplicon name check

        By moving this, it will check if there is an amplicon named
        'Prime-edited' (which is a reserved name) even if the
        `prime_editing_pegRNA_extension_seq` parameter is empty.

        * Only search for scaffold integration when pegRNA extension seq is provided

        * Remove spaces at the end of lines

        * Docker size (#49)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * 3.4->2.08

        * Put ttf-mscorefonts-installer back above apt-get clean

        * restore slash, replace fastp with trimmomatic and flash, add autoremove step

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * initial readme modifications

        * Updated readme to remove deprecated commands, updated help text to reflect new version and fastp

        * Pointing test branch back at master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Samuel Nichols <[email protected]>

        * Guardrails clean history (#34)

        * Include guardrail functions

        * Add CRISPRessoReports subtree

        * Refactor to use CRISPRessoReports module

        * Include guardrail functions

        * Functional guardrails, needs reports update

        * Add guardrail partial

        * Fix guardrails partial

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * Run tests individually

        * Pin plotly version

        * Run all tests even if one fails

        * Test on another branch

        * Switch branch with token

        * Update integration_tests.yml

        * Introduce pandas sorting in CRISPRessoCompare (#47)

        * New makefile commands

        * Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

        * Extract out split_interleaved_fastq function to CRISPRessoShared

        * Implement splitting interleaved fastq files in CRISPRessoPooled

        * Suppress split_interleaved_input from CRISPRessoWGS parameters

        * Suppress other parameters in CRISPRessoWGS

        * Move where interleaved fastq files are split to be trimmed properly

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * On push no branches

        * On push no branches

        * All in one file

        * Fix yml errors

        * Rename jobs

        * Remove old workflow files

        * Remove paths

        * Run jobs in parallel

        ---------

        Co-authored-by: mbowcut2 <[email protected]>
        Co-authored-by: Cole Lyman <[email protected]>

        * Update C cythonized files

        * Add exact numbers to guardrails printouts

        * Remove extraneous whitespace from CRISPRessoCOREResources.pyx

        * Fix calculation of `total_mods` from being negative

        The issue was that `all_deletion_coordinates` just tells you how many deletions
        were present, but not how long the deletion is.

        * Changes to message

        * Remove old tag

        * Point tests at guardrails

        * Restore C2 pro check

        * Save message with guardrail name

        * Point tests repo at master

        ---------

        Co-authored-by: Cole Lyman <[email protected]>
        Co-authored-by: mbowcut2 <[email protected]>

        * Fix case sensitivity in Prime Editing mode (#54)

        * Move read filtering to after merging in CRISPResso (#39)

        * Move read filtering to after merging

        This is in an effort to be consistent with the behavior and results of
        CRISPRessoPooled.

        * Properly assign the correct file names for read filtering

        * Add space around operators

        * GitHub actions on pr (#51)

        * Run integration tests on pull_request

        * Run pytest on pull_request

        * Run pylint on pull_request

        * Run tests on PR only when opening PR (#53)

        * Update reports (#52)

        * Update report changes

        * Switch branch of integration test repo

        * Remove extraneous `crispresso_data_path`

        * Point integration tests back to master

        * Make all amplicons in amplicon_seq_arr uppercase

        This fixes https://github.com/pinellolab/CRISPResso2/issues/396

        * Allow RNA values to be provided for prime_editing_pegRNA_scaffold_seq

        * Fix 'Prime-edited' key not found (#32)

        * Move 'Prime-edited' amplicon name check

        By moving this, it will check if there is an amplicon named
        'Prime-edited' (which is a reserved name) even if the
        `prime_editing_pegRNA_extension_seq` parameter is empty.

        * Only search for scaffold integration when pegRNA extension seq is provided

        * Remove spaces at the end of lines

        * Docker size (#49)

        * Bug Fix - 367 (#35)

        * - Fixed references to ref_names_for_pe

        * removed extra tabs

        * trying to match empty line, no tabs

        * - changed references to ref_names[0]

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

        * Add documentation to to_numeric_ignore_columns

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        ---------

        Co-authored-by: Cole Lyman <[email protected]>

        * GitHub actions integration tests (#48)

        * GitHub actions clean (#40)

        * Create pytest.yml

        * Create pylint.yml

        * Create .pylintrc

        * Create test_env.yml

        * Full path

        * Remove conda install

        * Replace path

        * Pytest tests

        * pip -e

        * Create integration_tests.yml

        * Simplify name

        * CRISPRESSO2_DIR environment variable

        * Up one dir

        * ls workspace

        * Install CRISPResso and ydiff

        * Clone repo instead of checkout

        * submodule

        * ls

        * CRISPResso2_copy

        * ls

        * Update env

        * Simplify

        * Pull from githubactions branch

        * Pull githubactions repo

        * Checkout githubactions

        * Mckay/pd warnings (#45)

        * refactor errors='ignore' to try except

        * refactored integer slice to iloc[]

        * moved to_numeric try except to function

        * Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

        This change is slightly cleaner because it addresses the root issue that some
        columns are strings (and can therefore not be converted to numeric types). Now
        if an error does occur when converting the dfs to numeric types it won't be
        swallowed up.

…

Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: Trevor Martin <[email protected]>
Co-authored-by: Samuel Nichols <[email protected]>
Co-authored-by: Trevor Martin <[email protected]>