Commit

Notebook fixes (#5)
* Minor notebook fixes

* Text fixes

* Update Ryzen AI Software stack

* Typos

* Consistency

* Consistency

* Consistency
mariodruiz authored Jan 19, 2024
1 parent b5c1c8c commit 06103e8
Showing 10 changed files with 32 additions and 43 deletions.
2 changes: 1 addition & 1 deletion notebooks/3_1_Color_threshold_example.ipynb
@@ -69,7 +69,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<center><img src=\"./images/png/system_architecture.png\" style=\"max-height: 450px; width:auto; height:auto;\"></center>\n",
"<center><img src=\"./images/png/ryzen_ai_labels.png\" style=\"max-height: 450px; width:auto; height:auto;\"></center>\n",
"<center><strong>Ryzen 7040 connected to system memory, webcam and display</strong></center>"
]
},
12 changes: 6 additions & 6 deletions notebooks/3_2_Ryzenai_capabilities.ipynb
@@ -58,7 +58,7 @@
"source": [
"## Color Threshold dataflow graph\n",
"\n",
"Returning to the color threshold example; this is one of the simplest examples we can build for the Ryzen AI NPU. Its dataflow graph and the corresponding mapping to the NPU column are shown below. (Dataflow graphs were introduced in the [Riallto overview video](https://www.riallto.ai/ryzenai_video_overview.html)) in section 1. As a reminder, a dataflow graph is a graphical representation of a computation or a workflow, where nodes represent operations or tasks, and edges or connections between nodes represent the flow of data between these operations.\n",
"Returning to the color threshold example; this is one of the simplest examples we can build for the Ryzen AI NPU. Its dataflow graph and the corresponding mapping to the NPU column are shown below. (Dataflow graphs were introduced in the [Riallto overview video](https://www.riallto.ai/ryzenai_video_overview.html)). As a reminder, a dataflow graph is a graphical representation of a computation or a workflow, where nodes represent operations or tasks, and edges or connections between nodes represent the flow of data between these operations.\n",
"\n",
"The color threshold example's dataflow graph has a single node connected via memory buffers to its input and output. In general, each node in a dataflow graph is associated with one or more programs or subprograms that it will execute for a given application to run successfully. The programs are referred to throughout these notebooks as _software kernels_ or simply _kernels_. The color threshold application has one node and one kernel.\n",
"\n",
@@ -283,7 +283,7 @@
"source": [
"#### Compute tile Data movers\n",
"\n",
"Compute tiles have data movers that complement the nearest neighbor interfaces to move allow data to be moved from the compute tile to anywhere in the array.\n",
"Compute tiles have data movers that complement the nearest neighbor interfaces allowing data to be moved from the compute tile to anywhere in the array.\n",
"\n",
"Each compute tile has 4 data movers. 2 data movers are associated with input data streams and 2 are associated with output data streams."
]
@@ -300,8 +300,8 @@
"\n",
"| Communication | Load | Store |\n",
"|-----------------|---------------|---------------|\n",
"| Neighboring | 512-bits/cycle | 256-bits/cycle |\n",
"| Non-neighboring | 64-bits/cycle | 64-bits/cycle |"
"| Neighboring | 512-bit/cycle | 256-bit/cycle |\n",
"| Non-neighboring | 64-bit/cycle | 64-bit/cycle |"
]
},
{
@@ -408,7 +408,7 @@
"\n",
"### Leftmost column is different\n",
"\n",
"In the Ryzen AI NPU, every column typically has 1 interface tile. We say 'typically' because the leftmost column does not have an interface tile, as shown below. This \"irregularity\" is a silicon implementation detail we have ignored until now and can continue to ignore for the remainder of this tutorial. You can see this asymmetry in the image below. The two leftmost columns of compute and memory tiles can be used as a group with the single interface tile at the bottom of the second column.\n",
"In the Ryzen AI NPU, every column typically has 1 interface tile. We say 'typically' because the leftmost column does not have an interface tile. This \"irregularity\" is a silicon implementation detail we have ignored until now and can continue to ignore for the remainder of this tutorial. You can see this asymmetry in the image above. The two leftmost columns of compute and memory tiles can be used as a group with the single interface tile at the bottom of the second column.\n",
"\n",
"\n",
"---"
@@ -449,7 +449,7 @@
"\n",
"\n",
"Multiple different applications can run on the NPU concurrently. For example, each of the Windows Studio Effects we saw earlier are separate applications that run concurrently on the NPU in different columns. You can also run two or more instances of the same application concurrently, each instance processing different streams of data simultaneously.\n",
" \n",
"\n",
"---"
]
},
32 changes: 16 additions & 16 deletions notebooks/3_3_Scaled_color_threshold_example.ipynb
@@ -126,12 +126,12 @@
"The example has sixteen widgets. You can use these to control each threshold value independently for the red, blue and green channels in each of the stripes. The \"type\" widget allows you to select the threshold operation type. You can select: \n",
"\n",
"* BINARY - pixels with intensity values greater than the threshold are set to a specified maximum value (255), and pixels with intensity values less than or equal to the threshold are set to zero.\n",
"* TRNUC (truncate) - if a pixel value is greater than the threshold, it is replaced with the threshold value; otherwise, it remains the same.\n",
"* TRUNC (truncate) - if a pixel value is greater than the threshold, it is replaced with the threshold value; otherwise, it remains the same.\n",
"* TOZERO - if a pixel value is less than the threshold, it is set to zero; otherwise, it remains the same. \n",
"\n",
"The _INV variants of BINARY and TOZERO will invert the operation.\n",
"\n",
"Experiment with the widgets to see how different threshold levels on the different channels effects the video output."
"Experiment with the widgets to see how different threshold levels on the different channels affects the video output."
]
},
{
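For reference, the threshold types described in this hunk can be sketched in plain Python. This is only an illustration of the wording in the notebook text, not the NPU kernel itself: the function name `threshold`, its parameters, and the behavior of the `_INV` variants (assumed to swap the two outcomes) are assumptions, and pixels exactly equal to the threshold are handled as the text describes.

```python
def threshold(pixels, thresh, mode, max_value=255):
    """Apply one threshold type to a list of 8-bit channel values.

    Hypothetical helper for illustration; names and _INV semantics are assumptions.
    """
    out = []
    for p in pixels:
        if mode == "BINARY":
            # Above the threshold -> max value, otherwise zero.
            out.append(max_value if p > thresh else 0)
        elif mode == "BINARY_INV":
            # Assumed inverse of BINARY: outcomes swapped.
            out.append(0 if p > thresh else max_value)
        elif mode == "TRUNC":
            # Above the threshold -> clamp to the threshold.
            out.append(thresh if p > thresh else p)
        elif mode == "TOZERO":
            # Below the threshold -> zero, otherwise unchanged.
            out.append(0 if p < thresh else p)
        elif mode == "TOZERO_INV":
            # Assumed inverse of TOZERO: keep low values, zero the rest.
            out.append(p if p < thresh else 0)
        else:
            raise ValueError(f"unknown threshold type: {mode}")
    return out


sample = [10, 100, 200]
binary = threshold(sample, 100, "BINARY")
trunc = threshold(sample, 100, "TRUNC")
tozero = threshold(sample, 100, "TOZERO")
```

Applying the function once per channel, with a separate threshold for red, green and blue, mirrors how the widgets control each channel independently.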
@@ -210,20 +210,20 @@
"\n",
"In all, the application uses:\n",
"\n",
"* 1 interface tile.\n",
" * 2 data movers.\n",
" * 1 for stream input and 1 for stream output.\n",
"* 1 memory tile.\n",
" * 1 input data buffer.\n",
" * 1 output data buffers.\n",
" * 10 data movers.\n",
" * 1 for stream input and 1 for stream output to the interface tile.\n",
" * 4 for stream outputs to the compute tiles and 4 for stream inputs to the compute tiles.\n",
"* 4 compute tiles.\n",
" * 1 for each of the kernels.\n",
" * 1 input and output memory buffer in the data memory of each tile.\n",
" * 2 data movers in each tile.\n",
" * 1 for stream input and 1 for stream output.\n"
"* 1 interface tile\n",
" * 2 data movers\n",
" * 1 for stream input and 1 for stream output\n",
"* 1 memory tile\n",
" * 1 input data buffer\n",
" * 1 output data buffers\n",
" * 10 data movers\n",
" * 1 for stream input and 1 for stream output to the interface tile\n",
" * 4 for stream outputs to the compute tiles and 4 for stream inputs to the compute tiles\n",
"* 4 compute tiles\n",
" * 1 for each of the kernels\n",
" * 1 input and output memory buffer in the data memory of each tile\n",
" * 2 data movers in each tile\n",
" * 1 for stream input and 1 for stream output\n"
]
},
{
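The resource list in this hunk can be cross-checked with a short tally. The dictionary layout and variable names below are assumptions made for illustration; the counts themselves are taken from the list.

```python
# Data movers per tile type, as listed in the notebook text.
interface_tile_movers = {"stream_in": 1, "stream_out": 1}
memory_tile_movers = {
    "from_interface": 1,   # stream input from the interface tile
    "to_interface": 1,     # stream output to the interface tile
    "to_compute": 4,       # one stream output per compute tile
    "from_compute": 4,     # one stream input per compute tile
}
compute_tiles = 4
movers_per_compute_tile = {"stream_in": 1, "stream_out": 1}

interface_total = sum(interface_tile_movers.values())
memory_total = sum(memory_tile_movers.values())
compute_total = compute_tiles * sum(movers_per_compute_tile.values())
total_movers = interface_total + memory_total + compute_total

print(interface_total, memory_total, compute_total, total_movers)
```

The per-tile totals match the list (2 for the interface tile, 10 for the memory tile, 2 in each compute tile), giving 20 data movers across the whole application.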
12 changes: 6 additions & 6 deletions notebooks/3_4_Edge_detect_example.ipynb
@@ -10,13 +10,13 @@
"\n",
"## Goals\n",
"\n",
"* Introduction to the data memories of the compute tiles.\n",
"* Introduction to the data memories of the compute tiles\n",
"\n",
"* Explore dataflow between neighboring compute tiles via task level parallelism.\n",
"* Explore dataflow between neighboring compute tiles via task level parallelism\n",
"\n",
"* Understand how ping-pong buffering is implemented with data memory banks to optimize dataflow between neighboring tiles.\n",
"* Understand how ping-pong buffering is implemented with data memory banks to optimize dataflow between neighboring tiles\n",
"\n",
"* Learn how to achieve overlapped compute and data movement.\n",
"* Learn how to achieve overlapped compute and data movement\n",
"\n",
"## References\n",
"\n",
@@ -308,7 +308,7 @@
"\n",
"* The compute tile in the top left corner of the NPU array can access 1 local data memory and 1 more in the neighboring tile to the south.\n",
"\n",
"* The compute tile in the bottom left corner of the NPU array can access 1 local data memory and 1 more in the neighboring tile to the north\n",
"* The compute tile in the bottom left corner of the NPU array can access 1 local data memory and 1 more in the neighboring tile to the north.\n",
"* There are $2$ tiles that can access 2 local data memories.\n",
"\n",
"These tiles are colored <span style=\"color:#CC79A7; font-weight: bold;\">pink</span> in the diagram below.\n",
@@ -384,7 +384,7 @@
"\n",
"### Basic operation of ping-pong buffers\n",
"\n",
"The operation of ping-pong buffers can be understood as follows: As one buffer is actively written to, it gradually fills up with data. Simultaneously, the other buffer is being read from, causing it to empty. When the write buffer becomes full at the same time the read buffer becomes empty, they switch roles. The formerly empty buffer becomes the new write buffer and starts to accumulate data, while the previously full buffer becomes the read buffer and begins to be emptied.\n",
"The operation of ping-pong buffers can be understood as follows: as one buffer is actively written to, it gradually fills up with data. Simultaneously, the other buffer is being read from, causing it to empty. When the write buffer becomes full at the same time the read buffer becomes empty, they switch roles. The formerly empty buffer becomes the new write buffer and starts to accumulate data, while the previously full buffer becomes the read buffer and begins to be emptied.\n",
"\n",
"This mechanism allows both processes to access the buffer concurrently, with one writing while the other reads. Ping-pong buffering effectively minimizes latency and enhances data throughput, making it particularly useful in real-time applications."
]
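The ping-pong mechanism described in this hunk can be sketched as a sequential simulation. This is a software model only, assuming a hypothetical `ping_pong` function and a fixed buffer size; on the NPU the writer and reader run concurrently on different data memory banks, whereas here each loop iteration stands in for one such overlapped step.

```python
def ping_pong(stream, buffer_size):
    """Move `stream` through two buffers that swap roles each step.

    Hypothetical model: one loop iteration represents the reader draining
    one buffer while the writer fills the other.
    """
    buffers = [[], []]
    write_idx = 0
    consumed = []
    chunks = [stream[i:i + buffer_size] for i in range(0, len(stream), buffer_size)]

    # The extra None step lets the reader drain the final buffer.
    for chunk in chunks + [None]:
        read_idx = 1 - write_idx
        # Reader empties the buffer that was filled in the previous step.
        consumed.extend(buffers[read_idx])
        buffers[read_idx] = []
        # Writer fills the other buffer during the same step.
        if chunk is not None:
            buffers[write_idx] = list(chunk)
        # Full write buffer and empty read buffer switch roles.
        write_idx = read_idx
    return consumed


result = ping_pong(list(range(6)), buffer_size=2)
```

Because the reader always works on the buffer filled one step earlier, data arrives in order and neither side waits for the other once the first buffer is full.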
2 changes: 1 addition & 1 deletion notebooks/3_5_Color_detect_example.ipynb
@@ -166,7 +166,7 @@
"metadata": {},
"source": [
"- Step 1\n",
" * the interface tile moves a column of data from system memory to the data memory in the memory tile\n",
" * the interface tile moves a column of data from system memory to the data memory in the memory tile.\n",
"- Step 2\n",
" * in the memory tile, data movers will multicast the input stream to both the `rgb2hsv` kernel (orange, in the top compute tile) and `mask` kernel (dark orange, in the bottom compute tile). One data mover in the memory tile is used to send the data. Each compute tile uses a data mover to receive the stream. The stream switch in the bottom compute tile moves data (green) to the data memory in this tile and also sends the same data on the streaming interface to the north towards the top compute tile.\n",
"- Step 3\n",
2 changes: 1 addition & 1 deletion notebooks/4_2_write_your_kernel.ipynb
@@ -132,7 +132,7 @@
"\n",
"The pre-defined `RGB720pBuilder` is a mapped dataflow graph that uses an interface tile and a single compute tile in an NPU column. We say \"mapped\" here because the application is mapped to specific tiles in the column - in this case the template is configured to use the bottom compute tile in a column. The interface tile data movers are pre-configured to handle _720p_ RGBA images (1280x720 pixels). They pass the data to the first compute tile and move the resulting data from the compute tile back to external memory. \n",
"\n",
"The application will process an entire row of the image at a time. This is the tiling process we described in an earlier notebook. Each RGBA pixel (Red, Green, Blue, Alpha) is 32-bits or 4-bytes and there are 1,280 pixels in a row. The data movers are pre-configured for transfers of 4 x 1,280 = 5,120 bytes. \n",
"The application will process an entire row of the image at a time. This is the tiling process we described in an earlier notebook. Each RGBA pixel (Red, Green, Blue, Alpha) is 32-bit or 4-Byte and there are 1,280 pixels in a row. The data movers are pre-configured for transfers of 4 x 1,280 = 5,120 bytes. \n",
"\n",
"The dataflow graph for this application is illustrated below, along with its mapping to a section of an NPU column. Note again that only one compute tile is used for this application and that this is the compute tile at the bottom of an NPU column. For simplicity, the three empty compute tiles in the rest of the column are not shown here. \n",
"\n",
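The transfer size quoted in this hunk follows directly from the image format. A minimal check, with constant names chosen here for illustration; the full-frame figure is derived from the same numbers and is not stated in the notebook text:

```python
BYTES_PER_PIXEL = 4          # RGBA: 8 bits each for red, green, blue, alpha
WIDTH, HEIGHT = 1280, 720    # 720p frame

row_bytes = BYTES_PER_PIXEL * WIDTH   # one data mover transfer (one image row)
frame_bytes = row_bytes * HEIGHT      # all rows of a single frame

print(row_bytes, frame_bytes)
```

One frame therefore takes 720 row-sized transfers of 5,120 bytes each.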
2 changes: 1 addition & 1 deletion notebooks/5_1_pytorch_onnx_inference.ipynb
@@ -292,7 +292,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the next cell to set up the `XLNX_VART_FIRMWARE` environmental variable to point to the NPU binary. The NPU binary `1x4.xclbin` is an AI design that provides up to 2 TOPS performance. up to four such AI streams can be run in parallel on the NPU without any visible loss of performance."
"Run the next cell to set up the `XLNX_VART_FIRMWARE` environmental variable to point to the NPU binary. The NPU binary `1x4.xclbin` is an AI design that provides up to 2 TOPS performance. Up to four such AI streams can be run in parallel on the NPU without any visible loss of performance."
]
},
{
11 changes: 0 additions & 11 deletions notebooks/5_2_pytorch_onnx_re-train.ipynb
@@ -426,17 +426,6 @@
"## Step 4: Quantize the Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-box alert-info\"> \n",
" \n",
"Before running the next steps, you need to install the [Vitis AI ONNX Quantization](https://ryzenai.docs.amd.com/en/latest/inst.html#install-quantizer) tool.\n",
"\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
Binary file modified notebooks/images/png/ryzen-ai-sdk.png
Binary file removed notebooks/images/png/system_architecture.png
Binary file not shown.
