Skip to content

Commit

Permalink
docs: profiling local CoreFX build using PerfView (#838)
Browse files Browse the repository at this point in the history
  • Loading branch information
adamsitnik authored Sep 3, 2019
1 parent 6c53679 commit 9d8f9bf
Show file tree
Hide file tree
Showing 26 changed files with 138 additions and 0 deletions.
Binary file added docs/img/perfview_0_welcome.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_10_no_grouping_applied.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_11_unsolved_symbols.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_12_lookup_symbols.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_13_set_time_range.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_15_by_name.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_16_go_to_callers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_17_callers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_18_call_tree.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_19_flame_graph.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_1_collect.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_21_goto_source.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_22_source_code.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_23_regression.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_24_regression_report.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_2_run_corerun.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_3_log_button.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_4_cpu_stacks.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_5_select_process.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_6_cpu_stacks_not_filtered.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_7_cpu_metric.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_8_grouppats_on.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/img/perfview_9_grouppats_off.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
138 changes: 138 additions & 0 deletions docs/profiling-workflow-corefx.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,12 @@
- [CPU](#CPU)
- [CoreRun](#CoreRun)
- [Allocation Tracking](#Allocation-Tracking)
- [PerfView](#PerfView)
- [CPU Investigation](#CPU-Investigation)
- [Filtering](#Filtering)
- [Analyzing the Results](#Analyzing-the-Results)
- [Viewing Source Code](#Viewing-Source-Code)
- [Identifying Regressions](#Identifying-Regressions)


## Prerequisites
Expand Down Expand Up @@ -251,3 +257,135 @@ Again, if you have configured everything properly you are able to right click on
![Go to Source File](img/vs_profiler_14_memory_goto_source.png)

![Actual Source File](img/vs_profiler_15_memory_source_file.png)

### PerfView

PerfView is the ultimate .NET Profiler and if you are new to PerfView it's recommended to read it's tutorial or watch the tutorial [videos](https://channel9.msdn.com/Series/PerfView-Tutorial).

![Welcome Screen](img/perfview_0_welcome.png)

#### CPU Investigation

We can **Collect** profile data by either **Run**ning a standalone executable (or command) or **Collect**ing the data machine wide with explicit start and stop.

![Collect Menu Items](img/perfview_1_collect.png)

When profiling local CoreFX build it's recommended to Run a standalone repro app using CoreRun.

![Run CoreRun](img/perfview_2_run_corerun.png)

If for some reason PerfView fails to run the command you need to click the `Log` button in left bottom corner and read the logs to find out what happened.

![Log button](img/perfview_3_log_button.png)

Once PerfView finishes the run you need to open the `CPU Stacks`

![CPU Stacks](img/perfview_4_cpu_stacks.png)

By default PerfView collects the profile data machine wide. Once you click on the `CPU Stacks` it shows you a window with the list of processes active during profile session. It sorts the processes descending by CPU time so the most active processes are at the top. You just need to select `CoreRun` and click `OK`.

![Select Process](img/perfview_5_select_process.png)

Once you do that it opens the `CPU Stacks` window with the default filtering applied.

![Non filtered CPU Stacks](img/perfview_6_cpu_stacks_not_filtered.png)

The first thing you need to do is to make sure that you have at least a few 1000 samples (preferably over 5000) and that the process is actually CPU bound.

The `Metric/Interval` is a quick measurement of how CPU bound the trace is as a whole. A value of 1 indicates a program that on average consumes all the CPU from a single processor. Unless that is high, your problem is not CPU (it can be some blocking operation like network/disk read).

![CPU Metric](img/perfview_7_cpu_metric.png)

##### Filtering

Fundamentally, what is collected by the PerfView profiler is a sequence of stacks. A stack is collected every millisecond for each hardware processor on the machine. This is very detailed information and hence by default PerfView groups the stacks. This is very useful when you are profiling a real-world application in a production environment, but when you work on the .NET Team and you profile some simple repro app you care about all details and you don't want the results to be grouped by modules.

![Group pats](img/perfview_8_grouppats_on.png)

This is why you should disable the grouping by selecting `[no grouping]` item from `GroupPats` combo box. If you want to learn more about `GroupPats` just click on it (it looks like a web link and it opens help).

![No grouping](img/perfview_9_grouppats_off.png)

Now you can see much more details:

![No grouping applied](img/perfview_10_no_grouping_applied.png)

But some of the results contain `!?` which means unsolved symbols.

![Unsolved symbols](img/perfview_11_unsolved_symbols.png)

To solve the symbols you just need to select given names, right click and choose `Lookup symbols`. You can also just press `Ctrl+A` (select all) and `Alt+S` (Lookup symbols).

![Lookup symbols](img/perfview_12_lookup_symbols.png)

PerfView does an excellent job at solving symbols. It's capable of solving not only the app code symbols but also .NET Runtime and Windows OS symbols. It's one of the moments when it shines compared to other profilers that very often struggle with that.

When we are using the `Collect -> Run` option PerfView profiles entire process lifetime. It's very useful when you are interested in things like .NET Runtime startup performance. But typically you just care about specific method. So the next thing you need to do is to set the Time Range. You can do that by selecting the interesting method, right clicking and choosing `Set Time Range`. Or just pressing `Alt+R`.

![Set Time Range](img/perfview_13_set_time_range.png)

As you can see, all the methods that were executed before the first and after last call to `DateTime.UtcNow` got filtered out:

![Time Range Applied](img/perfview_14_set_time_range_before_vs_after.png)

`When` - This is a visualization of how the INCLUSIVE samples collected for that node vary over time. The total range (from the Start and End text boxes), is divided into 32 time 'TimeBuckets' and the inclusive samples for that node are accumulated into those 32 buckets. Each bucket is then represented as a digit that represents a scaled value.

This simple text representation of histogram can be very useful when profiling more complex scenarios, but in this case it just shows us that `DateTime.UtcNow` was executed all the time. But this is exactly what we wanted!

##### Analyzing the Results

Once we get the data filtered we can start the analysis.

The `By Name` tab contains the list of all methods captured during profile time. This list is by default sorted in descending order by **Exc**lusive CPU time. It means that the most expensive methods which perform actual CPU computation are at the top of the table. These are the methods you care about.

![By name tab](img/perfview_15_by_name.png)

To find out who is calling the most time consuming method you can right click on it and select `Goto -> Goto Item in Callers` or just press `F10`.

![Go to callers](img/perfview_16_go_to_callers.png)

![Callers](img/perfview_17_callers.png)

If you wish you can see the entire `Call Tree` by clicking on the `Call Tree` tab:

![Call Tree](img/perfview_18_call_tree.png)

`Flame Graph` offers you a different form of visualization::

![Flame Graph](img/perfview_19_flame_graph.png)

The graph starts at the bottom. Each box represents a method in the stack (inclusive CPU time). Every parent is the caller, children are the callees. The wider the box, the more time it was on-CPU.

For the leaf nodes the inclusive time == exclusive time. The difference between the parent and children box width (marked with red on the image below) is the exclusive parent (caller) time.

```cs
parent.InclusiveTime - children.InclusiveTime = parent.ExclusiveTime
```

![Flame Graph Exclusive time](img/perfview_20_flame_graph_exclusive_time.png)

##### Viewing Source Code

If you want to view the Source Code of the given method you need to right-click on it and select `Goto Source (Def)` menu item. Or just press `Alt+D`.

![Goto Source](img/perfview_21_goto_source.png)

![Source Code](img/perfview_22_source_code.png)

If PerfView fails to show you the source code you should read the `Log` output. If you are unable to make it work and you really care about it you should switch to Visual Studio Profiler.

**Note:** As of today, PerfView keeps the `.pdb` files [opened](https://github.com/microsoft/perfview/pull/979) after showing the source code. It means that if you keep the trace file opened in PerfView and try to rebuild CoreFX/CLR the build is going to fail. You might need to close PerfView to rebuild CoreFX/CLR.
##### Identifying Regressions

PerfView has a built-in support for identifying regressions. To use it you need to:

* profile two different versions of .NET Core running exact same C# code
* open both trace files in PerfView and apply exactly the same filters
* choose `Regression -> With Baseline` from the top Menu

![Two trace files](img/perfview_23_regression.png)

![Regression Report](img/perfview_24_regression_report.png)

It's recommended to use it instead of trying to eyeball complex Flame Graphs.

0 comments on commit 9d8f9bf

Please sign in to comment.