Skip to content

Commit

Permalink
Mckay/pd warnings (#45)
Browse files Browse the repository at this point in the history
* refactor errors='ignore' to try except

* refactored integer slice to iloc[]

* moved to_numeric try except to function

* Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

This change is slightly cleaner because it addresses the root issue that some
columns are strings (and can therefore not be converted to numeric types). Now
if an error does occur when converting the dfs to numeric types it won't be
swallowed up.

* Add documentation to to_numeric_ignore_columns

---------

Co-authored-by: Cole Lyman <[email protected]>
  • Loading branch information
mbowcut2 and Colelyman committed Nov 8, 2024
1 parent ca59a47 commit b44bbdf
Showing 1 changed file with 6 additions and 7 deletions.
13 changes: 6 additions & 7 deletions CRISPResso2/CRISPRessoCORE.py
Original file line number Diff line number Diff line change
Expand Up @@ -4791,15 +4791,14 @@ def get_scaffold_len(row, scaffold_start_loc, scaffold_seq):
mod_pcts.append(np.concatenate(([ref_name, 'Deletions'], np.array(ref1_all_deletion_count_vectors[ref_name]).astype(float)/tot)))
mod_pcts.append(np.concatenate(([ref_name, 'Substitutions'], np.array(ref1_all_substitution_count_vectors[ref_name]).astype(float)/tot)))
mod_pcts.append(np.concatenate(([ref_name, 'All_modifications'], np.array(ref1_all_indelsub_count_vectors[ref_name]).astype(float)/tot)))
mod_pcts.append(np.concatenate(([ref_name, 'Total'], [counts_total[ref_name]]*refs[ref_names[0]]['sequence_length'])))
colnames = ['Batch', 'Modification']+list(refs[ref_names[0]]['sequence'])
mod_pcts.append(np.concatenate(([ref_name, 'Total'], [counts_total[ref_name]]*refs[ref_names_for_pe[0]]['sequence_length'])))
colnames = ['Batch', 'Modification']+list(refs[ref_names_for_pe[0]]['sequence'])
pe_modification_percentage_summary_df = to_numeric_ignore_columns(pd.DataFrame(mod_pcts, columns=colnames), {'Batch', 'Modification'})

sgRNA_intervals = refs[ref_names[0]]['sgRNA_intervals']
sgRNA_names = refs[ref_names[0]]['sgRNA_names']
sgRNA_mismatches = refs[ref_names[0]]['sgRNA_mismatches']
sgRNA_sequences = refs[ref_names[0]]['sgRNA_sequences']
include_idxs_list = refs[ref_names[0]]['include_idxs']
sgRNA_intervals = refs[ref_names_for_pe[0]]['sgRNA_intervals']
sgRNA_names = refs[ref_names_for_pe[0]]['sgRNA_names']
sgRNA_mismatches = refs[ref_names_for_pe[0]]['sgRNA_mismatches']
include_idxs_list = refs[ref_names_for_pe[0]]['include_idxs']

plot_root = _jp('11a.Prime_editing_nucleotide_percentage_quilt')
pro_output_name = f'plot_{os.path.basename(plot_root)}.json'
Expand Down

0 comments on commit b44bbdf

Please sign in to comment.