Capture Performance Telemetry #39
Comments
I thought I could implement this in the VS Code adapter layer; unfortunately I need to send a sequence number.
We can certainly do it on the ptvsd side of things, but we'll need some proper logging there first in general.
Looking at the following comment (https://github.com/Microsoft/ptvsd/blob/master/ptvsd/ipcjson.py#L91), I assumed the plan was not to have any logging at that layer. I patched PTVSD to include some logging for my own dev purposes.
Oh, not at all - ipcjson.py comes from the old ptvsd, and I only made minor changes to it to accommodate some unusual attribute names. If you look at wrapper.py, it has some print logging commented out. But the idea is to do the same thing pydevd does - check for a certain environment variable, and log all traffic to the file it specifies if it's there.
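For illustration, here is a minimal sketch of that kind of environment-variable-gated traffic logging. The variable name `PTVSD_TRAFFIC_LOG` and the line format are assumptions for the example, not the actual names or format used by pydevd/ptvsd:

```python
import os
import time

# Hypothetical variable name; pydevd/ptvsd use their own names and formats.
_LOG_FILE = os.environ.get("PTVSD_TRAFFIC_LOG")

def log_message(direction, payload):
    """Append one timestamped line per DAP message when logging is enabled."""
    if not _LOG_FILE:
        return
    with open(_LOG_FILE, "a") as f:
        f.write("%.6f %s %s\n" % (time.monotonic(), direction, payload))

# Usage from the send/receive paths, e.g.:
#   log_message("->", json.dumps(request))
#   log_message("<-", json.dumps(event_or_response))
```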
@int19h @karthiknadig Telemetry for: time to start the debugger, etc. This will be a blocker for us shipping the wheels, as we won't know whether shipping wheels adversely impacts the debugger or not (i.e. there's no data).
We need to identify a specific list of metrics. And some of these will need to be implemented on the VS Code side - e.g. time to start the debugger has to be measured on that side, because it includes the process spawn time, which cannot be measured by the process itself. VS Code could measure it by starting a timer when the debug session starts, and stopping it when it sees the first "thread" event - this is the most reliable indicator that user code has started running in the debuggee process.
However, binary wheels also affect the execution of user code by improving tracing performance, so we also need to measure that somehow. We could measure the total time spent in user code - i.e. time during which the process isn't paused. But the problem is that for any app that blocks while running (e.g. reads input, or listens for a socket connection, or waits for a lock, etc.), our measurements will include those blocking waits, which will dominate perf. Then we need to consider multithreading, given that some threads might be paused while others are running, and that the GIL really only allows one thread running at a time.
This all needs to be very carefully designed for us to get any meaningful data out of this, and it won't be trivial to implement. We might want to do some manual perf testing first, on specifically written test code with various patterns (which can measure itself with the …
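The comment above is cut off, but as a rough illustration of the "self-measuring test code" idea, here is a minimal sketch; the specific patterns and the use of time.perf_counter are assumptions for the example, not an agreed design:

```python
import time

def busy_loop(n=1_000_000):
    # Stresses the per-line tracing decision inside a single frame.
    total = 0
    for i in range(n):
        total += i
    return total

def many_calls(n=100_000):
    # Stresses the per-frame "should this be traced at all?" decision.
    def add(a, b):
        return a + b
    total = 0
    for i in range(n):
        total = add(total, i)
    return total

def measure(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

if __name__ == "__main__":
    # Run this same script with no debugger, with plain ptvsd, and with the
    # cython/frame-eval wheels, then compare the printed numbers.
    for fn in (busy_loop, many_calls):
        print("%s: %.3fs" % (fn.__name__, measure(fn)))
```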
One other thing. Our logs already have high-resolution timestamps on every event, which start counting from the moment logging is initialized (which happens right after we parse the command line). So if you capture a full log, you can see how long the entire debug session took from the debugger's perspective, and you can see how long individual events took by comparing timestamps for messages. For example, to measure a step, you can look at the timestamp for the "stepIn" request, and the timestamp for the "stopped" event that followed - this will correspond pretty closely to the delay observed by the user in the UI.
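As an example of pulling such a measurement out of a captured log, here is a small sketch that pairs each "stepIn" request with the next "stopped" event. It assumes a hypothetical log format where every line starts with a high-resolution timestamp followed by the message text; the real ptvsd log format may differ, so the regex would need adjusting:

```python
import re

# Assumed line shape: "<seconds> <rest of message>"; adjust for the real format.
LINE = re.compile(r"^(?P<ts>\d+(?:\.\d+)?)\s+(?P<msg>.*)$")

def step_latencies(log_path):
    """Return the delay between each stepIn request and the following stopped event."""
    pending = None
    latencies = []
    with open(log_path) as f:
        for line in f:
            m = LINE.match(line)
            if not m:
                continue
            ts, msg = float(m.group("ts")), m.group("msg")
            if '"stepIn"' in msg and pending is None:
                pending = ts
            elif '"stopped"' in msg and pending is not None:
                latencies.append(ts - pending)
                pending = None
    return latencies

# Example: print(step_latencies("ptvsd.log"))
```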
Yes, this is possible. However, the metrics are for the PM team.
As a note, I do have some performance numbers on some scenarios (which I usually rerun when I make changes I think will affect performance). It has numbers for the regular run (without compiled speedups), a run with only cython, and a run with cython + frame eval, on the scenarios where I think the debugger is affected most -- the usual problem for the debugger is deciding quickly whether it can skip tracing a method or, if it does have to trace, how fast a line is handled when hitting some breakpoint or doing a step.
The numbers are committed in the repository:
https://github.com/microsoft/ptvsd/blob/master/src/ptvsd/_vendored/pydevd/tests_python/performance_check.py#L193
As for getting numbers from the telemetry on user machines, I think that the startup time is a reasonable metric, but apart from that, the times for a user machine would vary too much based on what the user/program is doing, so I'm not sure how useful such a metric would be (it'd be nice to have an explanation of what exactly the PM team wants from this telemetry, to check how to produce numbers that actually match that expectation).
…On Thu, Aug 15, 2019 at 10:20 PM Don Jayamanne wrote:
Yes this is possible. However the metrics are for the PM team. Best discussed separately, i.e. build a tool to extract and compare the values, or capture the metrics via debugger telemetry events (and use existing tools).
Of course, having an exact measure of what the user is doing would be ideal. Don and I spoke about this yesterday... for example, if they have some kind of 'sleep' within their code (as just an example), that could muddy the waters, but from a reporting standpoint, as long as we are averaging what is happening across all users, I think we should be fine. @DonJayamanne mentioned that this was done in the past. Please let me know if I didn't summarize that correctly, Don. I think a reasonable first pass is to use existing tools, come to an agreement on what we want to measure, and get it in... unless building the more 'accurate' tool can be done in a reasonable amount of time. I can speak with @qubitron about this next week.
Telemetry to calculate how long it takes to load the debugger is done. Any future telemetry items should have separate work items. Also, capturing telemetry from the IDE side using DebugAdapterTracker allows us to capture telemetry without depending on debugger-side changes.
In the recent release of VS Code we introduced some changes to capture some telemetry on the performance of the debugger, specifically:
stepIn, stepOut, continue, next
Now, I don't see any code in PTVSD to capture telemetry. That's fine.
The problem is we'll need this; without it we cannot measure the performance improvements (and this is a must for the first release).
So the question is: if this is something we're going to need for VS as well, then it must be done in PTVS.