-
Notifications
You must be signed in to change notification settings - Fork 673
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
2 changed files
with
56 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
## Profiler (Experimental) | ||
|
||
Currently, DJL supports experimental profilers for developers that | ||
investigate the performance of operator execution as well as memory consumption. | ||
As we are still working in progress on the feature, different engines have different APIs and produce different output format. | ||
In the future, we are considering to design a unified APIs and output unified format. | ||
|
||
### MXNet | ||
|
||
By setting the following environment variable, it generates `profile.json` after executing the code. | ||
``` | ||
export MXNET_PROFILER_AUTOSTART=1 | ||
``` | ||
You can view it in a browser using trace consumer like `chrome://tracing `. Here is a snapshot that shows the sample output. | ||
![img](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profiler_output_chrome.png) | ||
|
||
### PyTorch | ||
|
||
DJL have integrated PyTorch C++ profiler API and expose `JniUtils.startProfile` and `JniUtils.stopProfile(outputFile)` Java APIs. | ||
`JniUtils.startProfile` takes `useCuda(boolean)`, `recordShape(boolean)` and `profileMemory(boolean)` arguments respectively. | ||
`useCuda` indicates if profiler enables timing of CUDA events using the cudaEvent API. | ||
`recordShape` indicates if information about input dimensions will be collected or not. | ||
`profileMemory` indicates if profiler report memory usage or not. | ||
`JniUtils.stopProfile` takes a outputFile of String type. | ||
|
||
Wrap the code snippet you want to profile in between `JniUtils.startProfile` and `JniUtils.stopProfile`. | ||
Here is an example | ||
``` | ||
try (ZooModel<Image, Classifications> model = ModelZoo.loadModel(criteria)) { | ||
try (Predictor<Image, Classifications> predictor = model.newPredictor()) { | ||
Image image = ImageFactory.getInstance() | ||
.fromNDArray(manager.zeros(new Shape(3, 224, 224), DataType.UINT8)); | ||
JniUtils.startProfile(false, true, true); | ||
predictor.predict(image); | ||
JniUtils.stopProfile(outputFile); | ||
} catch (TranslateException e) { | ||
e.printStackTrace(); | ||
} | ||
``` | ||
The output format is composed of operator execution record. | ||
Each record contains `name`(operator name), `dur`(time duration), `shape`(input shapes), `cpu mem`(cpu memory footprint). | ||
``` | ||
{ | ||
"name": "aten::empty", | ||
"ph": "X", | ||
"ts": 24528.313000, | ||
"dur": 5.246000, | ||
"tid": 1, | ||
"pid": "CPU Functions", | ||
"shape": [[], [], [], [], [], []], | ||
"cpu mem": "0 b", | ||
"args": {} | ||
} | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters