Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving QIIME 2 summary visualizations #72

Closed
fedarko opened this issue Sep 17, 2019 · 7 comments · Fixed by #80
Closed

Improving QIIME 2 summary visualizations #72

fedarko opened this issue Sep 17, 2019 · 7 comments · Fixed by #80

Comments

@fedarko
Copy link
Collaborator

fedarko commented Sep 17, 2019

This came up in #70. For some background here: I tried using these options back in July, couldn't make sense of the plots, and when I asked about it the response was opening #63 :) ...So I ended up just running Songbird outside of QIIME 2 for that project.

Since generating TF logging files through QIIME 2 sounds like it'd be a hassle, keeping these plots in the Q2 Songbird plugin (and improving them, to be usable by people who aren't intimately familiar with ML/Songbird/Tensorflow) would make sense.

So! Here's what the current output of qiime songbird summarize-single looks like (zoomed out to show the full display)—

Screen Shot 2019-09-17 at 3 27 32 PM

And here's the current output from qiime songbird summarize-paired
Screen Shot 2019-09-17 at 3 35 17 PM

For reference, here's a screenshot of tensorboard showing a single run:

Screen Shot 2019-09-17 at 3 40 16 PM

Ideas for improving the Q2 summary plots

  1. Make the y-axes (Loglikelihood and Cross validation score) more descriptive, because the meanings aren't clear (I've talked to other people who also aren't sure how to interpret these).

    • I get that Tensorboard's labels also aren't that useful, but the saving grace there is that Songbird's README does a super good job of explaining how to interpret those plots. If we can at least make it clear which of the summary plots corresponds to which of the Tensorboard plots, that would make interpreting these a lot easier. (I'm guessing it's switched around now, and that the bottom Cross validation score plot corresponds to the top cv_error Tensorboard plot?)
    • When I first tried using summarize-single back in July, the shape of the plots was a lot grosser than the shape shown here. (The summarize-single plot here was generated using already-tuned-using-Tensorboard hyperparameters, which is why it looks nice.) Users who run into this page will likely start off with gross plots that don't resemble the expected shape at all, which is why having context is important.
  2. Add text explaining how to interpret these plots

    • This could be mostly copy-pasted from the README (we could even add a hyperlink to the relevant part(s) of the README, as an alternative).
    • This could cover common problems people run into, e.g. "so if you don't see anything in these plots you probably need to increase decrease your --p-summary-interval".

These are cosmetic changes, mostly, so fixing them shouldn't be too difficult. I'd like to hear your thoughts on these ideas—can start working on a PR to resolve this immediately afterwards, either tonight or tomorrow.

@fedarko
Copy link
Collaborator Author

fedarko commented Sep 17, 2019

This relates to #42, btw.

@fedarko
Copy link
Collaborator Author

fedarko commented Sep 17, 2019

Example of the last bullet point above, adding in info for the user about what to do if plots don't show up:
Screen Shot 2019-09-17 at 4 22 50 PM

Should be very feasible to either add more "FAQ"-style text to the summary HTML page, or add links to the relevant parts of the README.

fedarko added a commit to fedarko/songbird that referenced this issue Sep 17, 2019
@mortonjt
Copy link
Collaborator

@fedarko , thank you for raising this issue. Perhaps the easiest way to address this is to have inline text within the html below the figures describing the axes.

@fedarko
Copy link
Collaborator Author

fedarko commented Sep 18, 2019

@mortonjt Just to check, before I make more changes: these plots are switched around from in Tensorboard, right? That is, the top plot in the q2 summary (Loglikelihood) corresponds to the loss plot in Tensorboard, and the bottom plot in the q2 summary (Cross validation score) corresponds to the cv_error plot in Tensorboard?

(It looks like this is the case from the code below.)

convergence_stats = pd.DataFrame(
{
'loglikehood': loss,
'cross-validation': cv,
'iteration': its
}
)

I think reordering the plots to match Tensorboard's order + altering the axis labels (or at least Loglikelihood) to more closely match the Tensorboard plots will be a good step. Will work on this later today, pending your confirmation above.

@mortonjt
Copy link
Collaborator

mortonjt commented Sep 18, 2019 via email

fedarko added a commit to fedarko/songbird that referenced this issue Sep 18, 2019
fedarko added a commit to fedarko/songbird that referenced this issue Sep 18, 2019
fedarko added a commit to fedarko/songbird that referenced this issue Sep 19, 2019
shouldn't cause the plots to get shifted to the right any more.

also highlights the label ("Pseudo Q-squared:") in <strong> tags.
@fedarko
Copy link
Collaborator Author

fedarko commented Sep 19, 2019

Thanks @mortonjt! Ok, I've made a few changes to the output summaries. also restructured + updated the README pretty heavily, see https://github.com/fedarko/songbird. will try to add links between the new summaries and new README sections tomorrow + get this PR sent in.

Thanks for your help with this, and sorry to pester you with all of these updates—I think having this all be better documented will make a lot of people's times using Songbird easier, and I think that's something worth taking care of sooner rather than later :)

@mortonjt
Copy link
Collaborator

@fedarko , this is awesome! Looking forward to seeing the PR!

fedarko added a commit to fedarko/songbird that referenced this issue Sep 20, 2019
fedarko added a commit to fedarko/songbird that referenced this issue Sep 20, 2019
also updated the README summary screenshot accordingly.

I think now is kosher to submit a pr for biocore#72!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants