
Missing code snippets

karthik-soman edited this page Nov 16, 2021 · 1 revision

Find the missing pieces of code here!

The following code snippets belong to the notebook named patient_spoke_sig_analysis.

Make sure you have tried it yourself before referring to this page.

 

Look at cohort demographics

for col in ['Sex', 'Race', 'Ethnicity']:
    # Count patients in each (Disease, col) group
    counts = example_cohort[['Disease', col, 'patient_id']].groupby(['Disease', col]).count()
    df = counts.reset_index().rename(columns={'patient_id': 'Count'})
    ax = sns.barplot(x='Disease', y='Count', hue=col, data=df)
    plt.show()

 

Look at cohort continuous variables

for col in ['Age', 'OMOP_Count', 'SEP_Count']:
    ax=sns.boxplot(x='Disease', y=col, data=example_cohort)
    plt.show()

 

Visualize cohort in 3d

%matplotlib notebook

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
for disease, name in zip(diseases, disease_names):
    # Row indices of the patients labelled with this disease
    pats = example_cohort[example_cohort.Disease == name].Patient_Index.values
    # Scatter on the 3D axes so the third array is used as the z coordinate
    ax.scatter(patient_to_disease_dist[pats, 0], patient_to_disease_dist[pats, 1], patient_to_disease_dist[pats, 2], label=name)
plt.legend()
plt.show()

 

Compare new patients to disease PSEVs

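# Cosine distance between each new patient signature and each disease PSEV
new_patient_to_disease_dist = cdist(new_spoke_sigs, psev_matrix, metric='cosine')
# Predict the disease whose PSEV is closest to each patient
best_match = np.array(disease_names)[np.argmin(new_patient_to_disease_dist, axis=1)]
# Fraction of patients whose closest PSEV matches their recorded disease
print(np.mean(new_cohort.Disease.values == best_match))
new_cohort.loc[:,'pred'] = best_match
new_cohort.loc[:,'match_correct'] = new_cohort.Disease.values == new_cohort.pred.values
# Count patients for each (recorded disease, predicted disease) pair
match_stats_df = new_cohort[['Patient_Index', 'Disease', 'pred']].groupby(['Disease','pred']).count().reset_index()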
match_stats_df
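
To see what the matching step does without the workshop data, here is a minimal sketch on toy data. The signature values, the disease labels (disease_a, disease_b, disease_c), and the true_labels array are made up for illustration and are not taken from the notebook; only cdist, np.argmin and the variable names psev_matrix, new_spoke_sigs, disease_names and best_match mirror the snippet above.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical toy data: 3 disease signatures (rows) over 4 features.
disease_names = ["disease_a", "disease_b", "disease_c"]
psev_matrix = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])

# Hypothetical patient signatures, one row per patient.
new_spoke_sigs = np.array([
    [0.9, 0.1, 0.0, 0.0],  # points mostly along disease_a
    [0.1, 0.8, 0.1, 0.0],  # points mostly along disease_b
    [0.0, 0.2, 0.7, 0.9],  # points mostly along disease_c
    [0.6, 0.7, 0.0, 0.0],  # labelled disease_a, but slightly closer to disease_b
])
true_labels = np.array(["disease_a", "disease_b", "disease_c", "disease_a"])

# dist[i, j] is the cosine distance between patient i and disease signature j.
dist = cdist(new_spoke_sigs, psev_matrix, metric="cosine")

# For each patient, pick the disease with the smallest distance.
best_match = np.array(disease_names)[np.argmin(dist, axis=1)]

# Fraction of patients whose closest signature matches their label.
accuracy = np.mean(best_match == true_labels)
print(best_match, accuracy)
```

Cosine distance compares only the direction of two vectors, so a patient whose signature is a scaled copy of a disease signature still has distance 0 to it. In this toy case the last patient is mislabelled by the nearest-signature rule, which is the kind of disagreement the match_stats_df table is meant to surface.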