D3-Enhancements (#78) (pinellolab#459)
* Sam/try plots (#71)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------



* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing of the exception (which is essentially a duplicate)
and adds a condition for when no config file was provided. Also changes `json`
to `config` so that the message is clearer.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when a file_prefix was used, the second compared run
was not listed as a data entry in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------



* Try catch all futures

* Fix test fail plots

* Point test to try-plots

* Fix d3 not showing and plotly mixing with matplotlib

* Use logger for warnings and debug statements

* Point tests back at master

---------




* Sam/fix plots (#72)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------



* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing of the exception (which is essentially a duplicate)
and adds a condition for when no config file was provided. Also changes `json`
to `config` so that the message is clearer.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when a file_prefix was used, the second compared run
was not listed as a data entry in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------



* Try catch all futures

* Fix test fail plots

* Fix d3 not showing and plotly mixing with matplotlib

---------




* Remove token from integration tests file

* Provide sgRNA_sequences to plot_nucleotide_quilt plots

* Passing sgRNA_sequences to plot

* Refactor check for determining when to use CRISPRessoPro or matplotlib for Batch plots

* Add max-height to Batch report samples

* Change testing branch

* Fix wrong check for large Batch plots

* Fix typo and move flexiguide to debug (#77)

* Change flexiguide output to debug level

* Fix typo in fastp merged output file name

* Adding id tags for d3 script enhancements

* pointing to test branch

* Add amplicon_name parameter to allele heatmap and line plots

* Add function to extract quantification window regions from include_idxs

* Scale the quantification window according to the coordinates of the sgRNA plot

* added c2pro check, added space in args.json

* Correct the quantification window indexes for multiple guides

* Fix name of nucleotide conversion plot when guides are not the same

* Fix jinja variables that aren't found

* Fix multiple guide errors where the wrong sgRNA sequence was associated in d3 plot

* Remove unneeded variable and extra whitespace

* Switch test branch to master

---------

Co-authored-by: Trevor Martin
Co-authored-by: Samuel Nichols
Co-authored-by: mbowcut2
4 people authored Aug 1, 2024
1 parent 09e5d97 commit 3f62643
Showing 7 changed files with 82 additions and 18 deletions.
18 changes: 12 additions & 6 deletions CRISPResso2/CRISPRessoBatchCORE.py
@@ -610,8 +610,8 @@ def main():
 if not args.suppress_plots and not args.suppress_batch_summary_plots and should_plot_large_plots(sub_nucleotide_percentage_summary_df.shape[0], C2PRO_INSTALLED, args.use_matplotlib):
 # plot for each guide
 # show all sgRNA's on the plot
-sub_sgRNA_intervals = []
-for sgRNA_interval in consensus_sgRNA_intervals:
+sub_sgRNA_intervals, sub_consensus_guides = [], []
+for sgRNA_index, sgRNA_interval in enumerate(consensus_sgRNA_intervals):
 newstart = None
 newend = None
 for idx, i in enumerate(sgRNA_plot_idxs):
@@ -633,6 +633,10 @@ def main():
 newend = len(include_idxs) - 1
 # and add it to the list
 sub_sgRNA_intervals.append((newstart, newend))
+sub_consensus_guides.append(consensus_guides[sgRNA_index])
+
+# scale the include_idxs to be in terms of the plot centered around the sgRNA
+sub_include_idxs = include_idxs - sgRNA_plot_idxs[0]

 this_window_nuc_pct_quilt_plot_name = _jp(amplicon_plot_name.replace('.', '') + 'Nucleotide_percentage_quilt_around_sgRNA_'+sgRNA)
 nucleotide_quilt_input = {
@@ -641,8 +645,8 @@ def main():
 'fig_filename_root': f'{this_window_nuc_pct_quilt_plot_name}.json' if not args.use_matplotlib and C2PRO_INSTALLED else this_window_nuc_pct_quilt_plot_name,
 'save_also_png': save_png,
 'sgRNA_intervals': sub_sgRNA_intervals,
-'sgRNA_sequences': consensus_guides,
-'quantification_window_idxs': include_idxs,
+'sgRNA_sequences': sub_consensus_guides,
+'quantification_window_idxs': sub_include_idxs,
 'custom_colors': custom_config['colors'],
 }
 debug('Plotting nucleotide percentage quilt for amplicon {0}, sgRNA {1}'.format(amplicon_name, sgRNA))
@@ -665,7 +669,7 @@ def main():
 'conversion_nuc_to': args.conversion_nuc_to,
 'save_also_png': save_png,
 'sgRNA_intervals': sub_sgRNA_intervals,
-'quantification_window_idxs': include_idxs,
+'quantification_window_idxs': sub_include_idxs,
 'custom_colors': custom_config['colors']
 }
 debug('Plotting nucleotide conversion map for amplicon {0}, sgRNA {1}'.format(amplicon_name, sgRNA))
@@ -754,7 +758,7 @@ def main():
 crispresso2_info['results']['general_plots']['summary_plot_labels'][plot_name] = 'Composition of each base for the amplicon ' + amplicon_name
 crispresso2_info['results']['general_plots']['summary_plot_datas'][plot_name] = [('Nucleotide frequencies', os.path.basename(nucleotide_frequency_summary_filename)), ('Modification frequencies', os.path.basename(modification_frequency_summary_filename))]
 if args.base_editor_output and should_plot_large_plots(nucleotide_percentage_summary_df.shape[0], False, args.use_matplotlib):
-this_nuc_conv_plot_name = _jp(amplicon_plot_name + 'Nucleotide_percentage_quilt')
+this_nuc_conv_plot_name = _jp(amplicon_plot_name + 'Nucleotide_conversion_map')
 conversion_map_input = {
 'nuc_pct_df': nucleotide_percentage_summary_df,
 'fig_filename_root': this_nuc_conv_plot_name,
@@ -807,6 +811,7 @@ def main():
 'plot_path': plot_path,
 'title': modification_type,
 'div_id': heatmap_div_id,
+'amplicon_name': amplicon_name,
 }
 debug('Plotting allele modification heatmap for {0}'.format(amplicon_name))
 plot(
@@ -841,6 +846,7 @@ def main():
 'plot_path': plot_path,
 'title': modification_type,
 'div_id': line_div_id,
+'amplicon_name': amplicon_name,
 }
 debug('Plotting allele modification line plot for {0}'.format(amplicon_name))
 plot(
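The re-basing of the quantification window in this file can be illustrated with a small standalone sketch. It is not CRISPResso2 code: the helper name `rebase_window` and all sample values are hypothetical, and it uses a plain list comprehension where the diff subtracts directly (which suggests `include_idxs` is array-like there).

```python
def rebase_window(include_idxs, sgRNA_plot_idxs):
    """Shift amplicon-wide indexes so that 0 is the first plotted position.

    Hypothetical helper mirroring `include_idxs - sgRNA_plot_idxs[0]`
    from the diff, written for plain Python sequences.
    """
    offset = sgRNA_plot_idxs[0]
    return [idx - offset for idx in include_idxs]


# Illustrative values only: the plot covers amplicon positions 15..44,
# and the quantification window covers positions 20..24.
sgRNA_plot_idxs = list(range(15, 45))
include_idxs = [20, 21, 22, 23, 24]

sub_include_idxs = rebase_window(include_idxs, sgRNA_plot_idxs)
print(sub_include_idxs)  # [5, 6, 7, 8, 9]
```

In this sketch, any index that lies before the first plotted position becomes negative; how the plotting code treats such values is not visible in this diff.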
12 changes: 6 additions & 6 deletions CRISPResso2/CRISPRessoReports/templates/batchReport.html
@@ -72,7 +72,7 @@ <h5 id="CRISPResso2_Batch_Output">{{report_name}}</h5>
 {% if window_nuc_pct_quilts|length > 0 %}
 <div class='card text-center mb-2'>
 <div class='card-header'>
-<h5>Nucleotide percentages around guides</h5>
+<h5 id="nucleotide-header">Nucleotide percentages around guides</h5>
 </div>
 <div class='card-body'>
 {% for plot_name in window_nuc_pct_quilts %}
@@ -86,11 +86,11 @@ <h5>{{report_data['titles'][plot_name]}}</h5>
 {% if nuc_pct_quilts|length > 0 %}
 <div class='card text-center mb-2'>
 <div class='card-header'>
-<h5>Nucleotide percentages in the entire amplicon</h5>
+<h5 id="nucleotide-header-full-amplicon">Nucleotide percentages in the entire amplicon</h5>
 </div>
 <div class='card-body'>
 {% for plot_name in nuc_pct_quilts %}
-<h5>{{report_data['titles'][plot_name]}}</h5>
+<h5>{{report_data['titles'][plot_name] if plot_name in report_data['titles'] else ''}}</h5>
 {{ render_partial('shared/partials/fig_summaries.html', report_data=report_data, plot_name=plot_name) }}
 {% endfor %}
 </div>
@@ -104,7 +104,7 @@ <h5>Conversion of target bases around guides</h5>
 </div>
 <div class='card-body'>
 {% for plot_name in window_nuc_conv_plots %}
-<h5>{{report_data['titles'][plot_name]}}</h5>
+<h5>{{report_data['titles'][plot_name] if plot_name in report_data['titles'] else ''}}</h5>
 {{ render_partial('shared/partials/fig_summaries.html', report_data=report_data, plot_name=plot_name) }}
 {% endfor %}
 </div>
@@ -118,7 +118,7 @@ <h5>Conversion of target bases in the entire amplicon</h5>
 </div>
 <div class='card-body'>
 {% for plot_name in nuc_conv_plots %}
-<h5>{{report_data['titles'][plot_name]}}</h5>
+<h5>{{report_data['titles'][plot_name] if plot_name in report_data['titles'] else ''}}</h5>
 {{ render_partial('shared/partials/fig_summaries.html', report_data=report_data, plot_name=plot_name) }}
 {% endfor %}
 </div>
@@ -129,7 +129,7 @@ <h5>{{report_data['titles'][plot_name]}}</h5>
 {% for plot_name in report_data['names'] %}
 <div class='card text-center mb-2'>
 <div class='card-header'>
-<h5>{{report_data['titles'][plot_name]}}</h5>
+<h5>{{report_data['titles'][plot_name] if plot_name in report_data['titles'] else ''}}</h5>
 </div>
 <div class='card-body'>
 {{ render_partial('shared/partials/fig_summaries.html', report_data=report_data, plot_name=plot_name) }}
8 changes: 5 additions & 3 deletions CRISPResso2/CRISPRessoReports/templates/report.html
@@ -317,7 +317,7 @@ <h5>Nucleotide composition for {{amplicon_name}}</h5>
 {% for (data_label,data_path) in report_data['figures']['datas'][amplicon_name]['plot_2a'] %}
 <p class="m-0"><small>Data: <a href="{{report_data['crispresso_data_path']}}{{data_path}}">{{data_label}}</a></small></p>
 {% endfor %}

 {% if 'plot_2b' in report_data['figures']['htmls'][amplicon_name] %}
 {{ report_data['figures']['htmls'][amplicon_name]['plot_2b']|safe }}
 {% elif report_data['figures']['sgRNA_based_names'][amplicon_name] and report_data['figures']['sgRNA_based_names'][amplicon_name]['2b']%}
@@ -623,9 +623,11 @@ <h5>Base editing for {{amplicon_name}}</h5>

 {% block foot %}
 <script>
+{% if not C2PRO_INSTALLED %}
 function updateZoom(e) {
 /*prevent any other actions that may occur when moving over the image:*/
 // e.preventDefault();
+
 var img = e.target.imgObj
 var view = e.target.viewObj
 var lens = e.target.lensObj
@@ -718,8 +720,8 @@ <h5>Base editing for {{amplicon_name}}</h5>

 {% endif %}
 {% endfor %}
-</script>
-
+{% endif %}
+</script>

 {% if C2PRO_INSTALLED %}
 <script src="https://unpkg.com/d3@5"></script>
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
 <div id="fig_summary_{{plot_name}}">
-{% if report_data['htmls'] and report_data['htmls'][plot_name]%}
+{% if report_data['htmls'] and plot_name in report_data['htmls']%}
 {{report_data['htmls'][plot_name]|safe}}
 {% else %}
 {% if plot_name in ['Nucleotide_conversion_map', 'Nucleotide_percentage_quilt'] %}
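Switching the template condition from a value lookup to a membership test changes which branch is taken for some entries. The plain-Python analogy below is only an approximation: it assumes `report_data['htmls']` behaves like an ordinary dict, uses `dict.get` to stand in for Jinja's default handling of a missing key (which is configurable), and the helper names and sample entries are hypothetical.

```python
def uses_html_by_value(htmls, plot_name):
    """Old-style check: the entry must exist and be truthy."""
    return bool(htmls and htmls.get(plot_name))


def uses_html_by_membership(htmls, plot_name):
    """New-style check: the entry only has to exist."""
    return bool(htmls and plot_name in htmls)


# Hypothetical report data: one populated entry, one empty entry.
htmls = {"plot_a": "<div>...</div>", "plot_b": ""}

for name in ("plot_a", "plot_b", "plot_c"):
    print(name, uses_html_by_value(htmls, name), uses_html_by_membership(htmls, name))
```

In this sketch the two checks disagree only for `plot_b`, the entry that is present but empty.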
32 changes: 32 additions & 0 deletions CRISPResso2/CRISPRessoShared.py
@@ -1655,6 +1655,38 @@ def get_sgRNA_mismatch_vals(seq1, seq2, start_loc, end_loc, coords_l, coords_r,
     return list(set(this_mismatches))


+def get_quant_window_ranges_from_include_idxs(include_idxs):
+    """Given a list of indexes, return the ranges that those indexes include.
+
+    Parameters
+    ----------
+    include_idxs: list
+        A list of indexes included in the quantification window, for example
+        `[20, 21, 22, 35, 36, 37]`.
+
+    Returns
+    -------
+    list
+        A list of tuples representing the ranges included in the quantification
+        window, for example `[(20, 22), (35, 37)]`. If there is a single index, it
+        will be reported as `[(20, 20)]`.
+    """
+    quant_ranges = []
+    if include_idxs is None or len(include_idxs) == 0:
+        return quant_ranges
+    start_idx = include_idxs[0]
+    last_idx = include_idxs[0]
+    for idx in include_idxs[1:]:
+        if idx == last_idx + 1:
+            last_idx = idx
+        else:
+            quant_ranges.append((start_idx, last_idx))
+            start_idx = idx
+            last_idx = idx
+    quant_ranges.append((start_idx, last_idx))
+    return quant_ranges
+
+
 ######
 # terminal functions
 ######
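For comparison, the same run-collapsing idea can be written with `itertools.groupby`. This is an alternative sketch, not the implementation in the commit; the name `ranges_from_idxs` is hypothetical. It relies on the observation that consecutive values share the same difference between value and list position.

```python
from itertools import groupby


def ranges_from_idxs(include_idxs):
    """Collapse indexes into inclusive (start, end) runs of consecutive values."""
    if include_idxs is None or len(include_idxs) == 0:
        return []
    ranges = []
    # Within a run of consecutive values, (value - position) stays constant.
    for _, run in groupby(enumerate(include_idxs), key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        ranges.append((values[0], values[-1]))
    return ranges


print(ranges_from_idxs([20, 21, 22, 35, 36, 37]))  # [(20, 22), (35, 37)]
print(ranges_from_idxs([20]))                      # [(20, 20)]
```

The explicit loop in the commit is arguably easier to follow; the `groupby` form mainly shows that the function is a standard "group consecutive integers" operation.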
2 changes: 1 addition & 1 deletion CRISPResso2/args.json
@@ -659,7 +659,7 @@
 "skip_failed": {
 "keys": ["--skip_failed"],
 "help": "Continue with batch analysis even if one sample fails",
-"action":"store_true",
+"action": "store_true",
 "tools": ["Batch", "Pooled", "WGS"]
 },
 "min_reads_for_inclusion": {
26 changes: 25 additions & 1 deletion tests/unit_tests/test_CRISPRessoShared.py
@@ -20,7 +20,7 @@ def test_get_mismatches():
 -5,
 -3,
 )
-assert len(mismatch_cords) == 6
+assert len(mismatch_cords) == 6

 def test_get_relative_coordinates():
 s1inds_gap_left, s1inds_gap_right = CRISPRessoShared.get_relative_coordinates('ATCGT', 'TTCGT')
@@ -74,3 +74,27 @@ def test_get_relative_coordinates_ind_and_dels():
     assert s1inds_gap_left == [0, 2, 2, 2, 3]
     assert s1inds_gap_right == [0, 2, 3, 3, 3]

+
+def test_get_quant_window_ranges_from_include_idxs():
+    include_idxs = [0, 1, 2, 10, 11, 12]
+    assert CRISPRessoShared.get_quant_window_ranges_from_include_idxs(include_idxs) == [(0, 2), (10, 12)]
+
+
+def test_get_quant_window_ranges_from_include_idxs_empty():
+    include_idxs = []
+    assert CRISPRessoShared.get_quant_window_ranges_from_include_idxs(include_idxs) == []
+
+
+def test_get_quant_window_ranges_from_include_idxs_single():
+    include_idxs = [50, 51, 52, 53]
+    assert CRISPRessoShared.get_quant_window_ranges_from_include_idxs(include_idxs) == [(50, 53)]
+
+
+def test_get_quant_window_ranges_from_include_idxs_single_gap():
+    include_idxs = [50, 51, 52, 53, 55]
+    assert CRISPRessoShared.get_quant_window_ranges_from_include_idxs(include_idxs) == [(50, 53), (55, 55)]
+
+
+def test_get_quant_window_ranges_from_include_idxs_multiple_gaps():
+    include_idxs = [50, 51, 52, 53, 55, 56, 57, 58, 60]
+    assert CRISPRessoShared.get_quant_window_ranges_from_include_idxs(include_idxs) == [(50, 53), (55, 58), (60, 60)]
