Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tensorboard and qiime2 #70

Open
mortonjt opened this issue Sep 17, 2019 · 10 comments
Open

Tensorboard and qiime2 #70

mortonjt opened this issue Sep 17, 2019 · 10 comments
Labels
dependencies tensorflow tensorflow tensorflow tensorflow tensorflow question Further information is requested

Comments

@mortonjt
Copy link
Collaborator

At the moment, tensorboard is disabled when running the qiime2 songbird command.
The original motivation was because an additional logging directory needs to be created in order to enable this.

I'm not sure how qiime2 is expected to handle logging directories - should they be typed?

Thoughts @fedarko @ebolyen?

@mortonjt mortonjt added the question Further information is requested label Sep 17, 2019
@ebolyen
Copy link
Member

ebolyen commented Sep 17, 2019

Is the goal to keep this directory around? Creating a well managed temporary directory is just fine from QIIME 2's perspective (dumping things into the current working directory or specific locations is usually where we take issue).

@mortonjt
Copy link
Collaborator Author

Right, the goal would be to create a logging directory that can be viewed through Tensorboard after the run is completed. This can help users debug their runs and compare have different runs differ (i.e. compare and contrast different covariates, priors, ...).

PR #71 currently creates this logging dir in the current directory (it has been disabled previously).

@ebolyen
Copy link
Member

ebolyen commented Sep 17, 2019

I see! We don't have an equivalent concept, but @thermokarst and I have been chatting recently about "diagnostic" outputs. This could help inform that design. From poking around, it looks like tensorboard is a server, and probably requests data from the logging directory as needed. So it doesn't fit into a full visualization on the surface.

@fedarko
Copy link
Collaborator

fedarko commented Sep 17, 2019

Does #71 (just writing to a logging directory, ignoring the standard notion of "output" in QIIME 2) have any actual downsides, aside from maybe not being the most elegant solution? Using future functionality baked in to QIIME 2 to handle this would be super nice, but IMO what's really important for now is making these diagnostics available to the user—to my understanding, this is pretty much required to make running Songbird (or mmvec?) useful, and while I can't speak on mmvec this isn't available when running qiime songbird right now.

One quick way to avoid the problem of dumping stuff into the current working directory (sort of :) might be adding a required --p-diagnostic-directory parameter (ideally specified relative to the CWD), and then having tensorflow output diagnostics to there. I recognize that this doesn't really fit with the current paradigm of having QIIME 2 take care of the I/O, but my vote would be strongly for a temporary solution like that instead of letting the problem sit.

@ElDeveloper
Copy link
Member

ElDeveloper commented Sep 17, 2019 via email

@mortonjt
Copy link
Collaborator Author

RE @fedarko / @ElDeveloper , that's basically what the standalone CLI is currently doing - if the user specifies a logging directory, it'll save Tensorboard files there. Otherwise, it'll automatically create a folding named by timestamp.

The main reason why I didn't push this into the q2 CLI this seems to go against q2 standards - all inputs / outputs need to be modularized in some shape or form.

Another work around is to have another output type (i.e. tensorboard type) that can be unzipped and visualized into Tensorboard. The diagnostics is just one big json - so it shouldn't be too big of an issue to just have a type for that.

Then it'll raise the question, where will this type live? Will this be in main q2_types? Or should this be in an auxiliary repo? This will be something that all packages using pytorch / Tensorflow will likely rely on.

@fedarko
Copy link
Collaborator

fedarko commented Sep 17, 2019

The diagnostics is just one big json - so it shouldn't be too big of an issue to just have a type for that.

Wait, really? In my experience running Songbird outside of QIIME 2, I always got a folder with 5 files (not counting the output differentials), and none of these files are JSON. (Attaching screenshot of what this looks like. The events.out.tfevent... file is by far the biggest file here, weighing in at 992 MB.)
Screen Shot 2019-09-17 at 1 51 16 PM

If saving tensorboard stuff without doing it through Q2 is really really really not kosher, then an alternative would be not following through with #63 and instead improving the qiime songbird summarize-single command to make its visualization a reasonable-enough approximation of tensorboard (at minimum, adding like a sentence next to each plot that says "this is what this plot represents"—would probably be possible to straight-up copy this from the README's FAQ). This doesn't have to be fancy, but it would enable basic diagnostics in a Q2-compliant way.

Regardless of which solution people want (just writing TF results to a logging dir, making TF output a Q2 type somehow, improving the current summary command to produce a visualization with adequate labels and context), I'm happy to help with the corresponding implementation. Just let me know.

@mortonjt
Copy link
Collaborator Author

I misremembered the actual format -- that can be found here.

But yes, the events.out* contains all of the logging information -- the other files are for checkpointing (which we often don't need to use in songbird -- that's only for jobs that takes days to run).

Regarding your comment on the summary commands - that was the initial purpose. Do you have specific issues that you ran into? Feel free to raise issues so that we can start patching them.

@fedarko
Copy link
Collaborator

fedarko commented Sep 18, 2019

If we can hammer out #72 satisfactorily, then I think kicking the can down the road on getting Tensorboard + Q2 more tightly integrated is ok (...I suppose it's not really my call anyway :). Thanks all for the discussion, and sorry for the rants.

@ebolyen
Copy link
Member

ebolyen commented Sep 18, 2019

So I looked more closely at how tensorboard works, and while its not beyond the realm of possibility to static-ify the program (see tensorflow/tensorboard#2045; there is a jupyter client, which expects a backend, but like described in the issue, we have a way to "cache" those requests in QZVs if we can change the host for the embedded client). That seems like a lot of work and is kind of brittle.

Pragmatically I am fine with a --p-diagnostic-dir or --p-tensorboard-log however supersystems (like Qiita) would probably prefer that their permanent filesystem is not be arbitrarily changed, which is why we so strongly discourage that kind of thing.

I am of course all for having nicer visualizations as well (I might recommend Vega here). Also, there's no reason we couldn't have our cake and eat it too, we could put the log directory into the visualization, so that savvy users can still find it, it's still attached to provenance, and a sufficiently advanced visualization can read directly from it in an ad-hoc way.

@fedarko fedarko added the dependencies tensorflow tensorflow tensorflow tensorflow tensorflow label Feb 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies tensorflow tensorflow tensorflow tensorflow tensorflow question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants