Improve Communication Protocol #14
Comments
It looks like Jaeger doesn't yet have a client-side API: jaegertracing/jaeger-client-node#109 jaegertracing/jaeger#723 It's being developed here: https://github.com/jaegertracing/jaeger-client-javascript As a workaround, I suppose we can set up a proxy server running alongside it, which the frontend can hit and which forwards the spans to the OpenTracing server.
I am experimenting with Jaeger, and I notice that it currently isn't set up to show spans that haven't finished (jaegertracing/jaeger#729). That's too bad, because it would be nice to have a debugging view for a chart as you interact with it. But maybe I should just make each interaction a separate span: one initial span for setting the chart up, then another for each UI update.
This sounds like you are going to reinvent the Jupyter kernels + messaging protocol 😅 Why would it need to be non-Jupyter?
@vidartf Because Jupyter is heavyweight! You might want to back this with a simple Flask server over REST or a websocket server: not connected to a kernel, just backed by a regular Python process. It's possible we want to reuse the Jupyter comms spec and just create other backends for it, so it can run without the Jupyter server. Or it's possible we want to base it on another spec like gRPC and have Jupyter comms be a backend for that.
I'm pretty sure this is how IPython started though 😉 More seriously, it would be interesting to hear which features would explicitly be included/excluded compared to the full jupyter_client + ipykernel + kernel manager/handlers case.
Currently, this extension uses Jupyter comms, which go over websocket, to send queries and data as the visualization is executing.
This "works," but it would be useful to make it easier to profile, debug, and switch out of Jupyter.
Profiling
For example, many queries are very slow. We need to be able to diagnose why that is. Is it data parsing and serialization? SQL response time? Ibis query creation?
To answer these properly, we should implement some form of "distributed tracing." Here is a New Relic UI that visualizes a trace of a request:
There is a W3 group "Distributed Tracing Working Group". They are working on a "Trace Context" spec.
Jaeger is a Cloud Native Computing Foundation project that exists today that says it will support this spec in the future:
My proposal is to try deploying Jaeger next to JupyterLab as a server extension and creating a frontend extension to display its UI in JupyterLab. That way, we can visualize the queries we are executing as we interact with the graphs and inspect their performance. We will also have to instrument our Python library and pass certain tokens along with each request and response, so that a request and its response stay tied to the same trace.
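To make the token passing concrete, here is a minimal sketch using only the standard library. It assumes the token follows the `traceparent` layout from the Trace Context spec (version, trace id, span id, flags). The message shape, the `handle_query` function, the stage names, and the `FINISHED_SPANS` list standing in for a tracing backend are all hypothetical, not part of this extension.

```python
import secrets
import time
from contextlib import contextmanager

# Stand-in for a tracing backend such as Jaeger (assumption: spans are
# only reported once they have finished).
FINISHED_SPANS = []


def new_traceparent():
    """Start a trace: version, 16-byte trace id, 8-byte span id, flags."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(parent):
    """Keep the trace id from the parent and mint a fresh span id."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


@contextmanager
def span(name, traceparent):
    """Time a block of work and record it under the given trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        FINISHED_SPANS.append({
            "name": name,
            "trace_id": traceparent.split("-")[1],
            "seconds": time.perf_counter() - start,
        })


def handle_query(message):
    """Hypothetical kernel-side handler; the real work is replaced by stubs."""
    traceparent = child_traceparent(message["traceparent"])
    with span("ibis_query_creation", traceparent):
        query = ("select", message["table"])
    with span("sql_execution", traceparent):
        rows = [(1,), (2,)]
    with span("serialization", traceparent):
        payload = {"query": list(query), "rows": [list(r) for r in rows]}
    # Echo the token back so the frontend can match response to request.
    return {"traceparent": traceparent, "data": payload}


request = {"traceparent": new_traceparent(), "table": "t"}
response = handle_query(request)
```

Each stage asked about above (query creation, SQL response time, serialization) gets its own span, and all of them share the trace id that the frontend put on the request.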
Protocol Agnostic
The second issue here is that currently this mime renderer is very tied to being run in JupyterLab. We have to add a special case for running it in Phoila so that it can access comms: vidartf/phoila#7
Over the past week, the idea of creating a mime renderer that needs to speak to a kernel but which you deploy outside of Jupyter has come up multiple times (cc @dharhas). To do this, we need to layer some standards on top of our current approach. A mime renderer on the client side should be a JS library that takes in a handle to a bidirectional async channel to the kernel. On the server, it should output some mimetype and also have a handle on this bidirectional communication channel.
The idea here is that you prototype in a Jupyter notebook, but then you can extract that cell from the notebook and run it on a non-Jupyter server, possibly embedded in some larger web app, with a non-Jupyter Python process on the backend. We should then be able to hook the two up to each other for bidirectional communication.
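As a rough illustration of the server half of that idea, the sketch below defines a hypothetical `Channel` interface with just `send` and `receive`, plus an in-memory backend built on `asyncio.Queue`. None of these names come from this extension; the assumption is that a Jupyter comm, a websocket, or a gRPC stream could each be wrapped to expose the same two methods, so the code on either end never knows which transport it is using.

```python
import asyncio
from typing import Any, Dict, Protocol, Tuple

Message = Dict[str, Any]


class Channel(Protocol):
    """Hypothetical transport-agnostic, bidirectional async channel."""

    async def send(self, message: Message) -> None: ...

    async def receive(self) -> Message: ...


class QueueChannel:
    """In-memory backend, useful for tests; one end of a connected pair."""

    def __init__(self, outbox: asyncio.Queue, inbox: asyncio.Queue) -> None:
        self._outbox = outbox
        self._inbox = inbox

    async def send(self, message: Message) -> None:
        await self._outbox.put(message)

    async def receive(self) -> Message:
        return await self._inbox.get()


def channel_pair() -> Tuple[QueueChannel, QueueChannel]:
    """Create two ends wired to each other."""
    a_to_b: asyncio.Queue = asyncio.Queue()
    b_to_a: asyncio.Queue = asyncio.Queue()
    return QueueChannel(a_to_b, b_to_a), QueueChannel(b_to_a, a_to_b)


async def serve_one_query(channel: Channel) -> None:
    """Backend side: answer a single request (the query work is a stub)."""
    request = await channel.receive()
    await channel.send({"id": request["id"], "rows": [request["query"].upper()]})


async def render(channel: Channel, query: str) -> Message:
    """Renderer side: it only ever sees the channel handle."""
    await channel.send({"id": 1, "query": query})
    return await channel.receive()


async def main() -> Message:
    client, server = channel_pair()
    server_task = asyncio.create_task(serve_one_query(server))
    reply = await render(client, "select 1")
    await server_task
    return reply


reply = asyncio.run(main())
```

Swapping the transport would then mean writing another class with the same two methods, without touching `render` or `serve_one_query`.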
Honestly, I am not sure what we should use here. The requirements would ideally be:
gRPC seems like a good contender here, since it is also a Cloud Native Computing Foundation project and has good adoption. Its web story is just emerging, but there is at least one client with websocket support.
I don't know how Panel fits in here. I imagine they have implemented something here.