Add profiler doc
stu1130 committed Mar 4, 2021
1 parent cb352ad commit 0ec8bb1
Showing 2 changed files with 56 additions and 0 deletions.
55 changes: 55 additions & 0 deletions docs/development/profiler.md
@@ -0,0 +1,55 @@
## Profiler (Experimental)

DJL currently provides experimental profilers for developers who want to
investigate operator execution performance and memory consumption.
Because the feature is still a work in progress, each engine has its own API and produces its own output format.
In the future, we are considering designing a unified API and a unified output format.

### MXNet

When the following environment variable is set, the MXNet engine generates a `profile.json` file after the code finishes executing.
```
export MXNET_PROFILER_AUTOSTART=1
```
You can view it in a browser using a trace viewer such as `chrome://tracing`. The following snapshot shows sample output.
![img](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profiler_output_chrome.png)

### PyTorch

DJL integrates the PyTorch C++ profiler API and exposes the `JniUtils.startProfile` and `JniUtils.stopProfile(outputFile)` Java APIs.
`JniUtils.startProfile` takes three boolean arguments, in this order: `useCuda`, `recordShape` and `profileMemory`.
`useCuda` indicates whether the profiler times CUDA events using the cudaEvent API.
`recordShape` indicates whether information about input dimensions is collected.
`profileMemory` indicates whether the profiler reports memory usage.
`JniUtils.stopProfile` takes an `outputFile` argument of type `String`.

Wrap the code you want to profile between calls to `JniUtils.startProfile` and `JniUtils.stopProfile`.
Here is an example:
```
// `criteria`, `manager` and `outputFile` are assumed to be defined by the surrounding code.
try (ZooModel<Image, Classifications> model = ModelZoo.loadModel(criteria)) {
    try (Predictor<Image, Classifications> predictor = model.newPredictor()) {
        Image image = ImageFactory.getInstance()
                .fromNDArray(manager.zeros(new Shape(3, 224, 224), DataType.UINT8));
        // Arguments: useCuda = false, recordShape = true, profileMemory = true
        JniUtils.startProfile(false, true, true);
        predictor.predict(image);
        JniUtils.stopProfile(outputFile);
    } catch (TranslateException e) {
        e.printStackTrace();
    }
}
```
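In the example above, `JniUtils.stopProfile` is never reached if `predictor.predict` throws. One possible way to guard against that is a small `AutoCloseable` wrapper, sketched below. `ProfileScope` is a hypothetical helper and not part of DJL. It takes the start and stop actions as `Runnable`s so that the sketch compiles without DJL on the classpath, and it assumes neither profiler call declares a checked exception.

```java
/**
 * Hypothetical helper (not part of DJL): runs a start action on construction
 * and a stop action on close, so the stop action runs even if the profiled code throws.
 */
public final class ProfileScope implements AutoCloseable {
    private final Runnable stop;

    public ProfileScope(Runnable start, Runnable stop) {
        this.stop = stop;
        start.run();
    }

    @Override
    public void close() {
        stop.run();
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        // With DJL on the classpath, the two actions would be
        // () -> JniUtils.startProfile(false, true, true) and
        // () -> JniUtils.stopProfile(outputFile),
        // assuming neither call declares a checked exception.
        try (ProfileScope scope = new ProfileScope(
                () -> log.append("start;"),
                () -> log.append("stop;"))) {
            log.append("work;");
        }
        System.out.println(log); // prints "start;work;stop;"
    }
}
```

Because try-with-resources closes the scope before any `catch` block runs, the stop action still executes when the profiled code fails.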
The output consists of operator execution records.
Each record contains `name` (operator name), `dur` (time duration), `shape` (input shapes) and `cpu mem` (CPU memory footprint).
```
{
  "name": "aten::empty",
  "ph": "X",
  "ts": 24528.313000,
  "dur": 5.246000,
  "tid": 1,
  "pid": "CPU Functions",
  "shape": [[], [], [], [], [], []],
  "cpu mem": "0 b",
  "args": {}
}
```
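To find the most expensive operators, the `dur` values can be summed per `name`. The following is a minimal sketch, assuming the output file contains records shaped like the one above, each with a `name` field followed by a `dur` field. `TraceSummary` and `totalDurationByName` are hypothetical names, and the regex stands in for a real JSON parser only to keep the sketch dependency-free.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of DJL) that summarizes profiler records. */
public final class TraceSummary {
    // Matches a "name" field followed by the next "dur" field.
    // Assumption: every record has both fields, with "name" before "dur", as in the sample.
    private static final Pattern RECORD = Pattern.compile(
            "\"name\"\\s*:\\s*\"([^\"]*)\".*?\"dur\"\\s*:\\s*([0-9.]+)", Pattern.DOTALL);

    /** Sums the "dur" values per operator name, sorted by name. */
    public static Map<String, Double> totalDurationByName(String trace) {
        Map<String, Double> totals = new TreeMap<>();
        Matcher m = RECORD.matcher(trace);
        while (m.find()) {
            totals.merge(m.group(1), Double.parseDouble(m.group(2)), Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up records for illustration; real input would be read from outputFile.
        String trace = "[{\"name\": \"aten::empty\", \"ph\": \"X\", \"dur\": 5.25},"
                + " {\"name\": \"aten::add\", \"dur\": 2.5},"
                + " {\"name\": \"aten::empty\", \"dur\": 1.25}]";
        totalDurationByName(trace)
                .forEach((name, dur) -> System.out.println(name + " " + dur));
    }
}
```

For anything beyond a quick look, a JSON library is the safer choice, since the regex would mispair fields if a record lacked `dur`.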
1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -76,6 +76,7 @@ nav:
- 'docs/development/memory_management.md'
- 'docs/development/inference_performance_optimization.md'
- 'docs/development/benchmark_with_djl.md'
- 'docs/development/profiler.md'
- DJL Community:
- 'docs/forums.md'
- 'leaders.md'