CLICK ME FOR INSTRUCTION OF THIS PROJECT
- Yi Guo
- Tested on: Windows 8.1, Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz, 8GB, NVIDIA GeForce 840M (Personal Notebook)
This is a GPU rasterizer. It implements the basic pipeline: vertex shading, primitive assembly, rasterization, fragment shading, and the frame buffer. The images and performance analysis of the project follow.
I slightly adjusted the Blinn-Phong parameters to make the material look somewhat metallic.
duck.gltf (Blinn-Phong) | cow.gltf (Blinn-Phong) | CeisumMilkTruck.gltf (Blinn-Phong) |
Here is the normal debug mode:
duckNormal.gltf | cowNormal.gltf | CeisumMilkTruckNormal.gltf |
I use the CUDA timer to record the time spent in each stage of the rasterization pipeline. Here are the graphs.
From the graph above, we can see that the rasterizer spends most of its time on rasterization and fragment shading. The time cost of vertex shading and primitive assembly scales with the number of triangles in the model. However, it is not fair to compare the rasterization and fragment shading cost of different models by triangle count alone. In the rasterization stage each thread renders a single triangle, so changing the number of triangles may not change the time cost much. What really makes a difference is how long each thread spends rendering its triangle. The graph below supports this argument.
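The per-triangle cost described above can be illustrated with a minimal CPU-side sketch in C++. It is not the project's kernel: the names `Vec2`, `edgeFunction`, and `rasterizeBoundingBox` are hypothetical, and the loop body stands in for the work a single GPU thread would do for its triangle. The function returns how many pixels were tested, which is the quantity that grows when a triangle covers more of the screen.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, c); the sign encodes the winding.
inline float edgeFunction(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Scans the triangle's screen-space bounding box and marks covered pixels.
// Returns the number of pixels tested, i.e. the work done for this triangle.
int rasterizeBoundingBox(Vec2 v0, Vec2 v1, Vec2 v2,
                         int width, int height, std::vector<int>& coverage) {
    int minX = std::max(0, static_cast<int>(std::floor(std::min({v0.x, v1.x, v2.x}))));
    int maxX = std::min(width - 1, static_cast<int>(std::ceil(std::max({v0.x, v1.x, v2.x}))));
    int minY = std::max(0, static_cast<int>(std::floor(std::min({v0.y, v1.y, v2.y}))));
    int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({v0.y, v1.y, v2.y}))));

    float area = edgeFunction(v0, v1, v2);
    if (area == 0.0f) return 0;  // degenerate triangle, nothing to draw

    int tested = 0;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            ++tested;
            Vec2 p{x + 0.5f, y + 0.5f};  // sample at the pixel center
            // Barycentric coordinates; all non-negative means p is inside.
            float w0 = edgeFunction(v1, v2, p) / area;
            float w1 = edgeFunction(v2, v0, p) / area;
            float w2 = edgeFunction(v0, v1, p) / area;
            if (w0 >= 0.0f && w1 >= 0.0f && w2 >= 0.0f) {
                coverage[y * width + x] = 1;
            }
        }
    }
    return tested;
}
```

Doubling the on-screen size of a triangle roughly quadruples the number of pixels tested, while the number of triangles (and therefore threads) stays the same.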
The graph above shows the time cost of a single model (here I use CeisumMilkTruckNormal.gltf) at different z values. The z value is the distance between the model and the camera. As it shows, the time cost of rasterization and fragment shading increases dramatically as the model gets closer to the camera. The reason is that when the distance between the camera and the model decreases, each triangle covers more pixels on the screen, so each thread has to scan a larger range to render its triangle.
There are better ways to rasterize a triangle. For each triangle, instead of scanning every pixel in the bounding box, we can compute the intersection points of each pixel row with the triangle's sides and render only the pixels between the two intersection points. I may implement this algorithm in the future to optimize the rasterization process.
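A rough C++ sketch of that row-intersection idea follows. It is an illustration under assumptions, not code from this project: `Vec2`, `rowSpan`, and `rasterizeScanline` are hypothetical names, and pixels are sampled at their centers. For each row, the sketch intersects the horizontal line with the triangle's non-horizontal sides and visits only the pixels between the leftmost and rightmost intersection.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

struct Vec2 { float x, y; };

// Finds the x-range [xLeft, xRight] where the horizontal line y = rowY
// crosses the triangle. Returns false when the row misses the triangle.
bool rowSpan(Vec2 a, Vec2 b, Vec2 c, float rowY, float& xLeft, float& xRight) {
    const Vec2 v[3] = {a, b, c};
    xLeft = std::numeric_limits<float>::infinity();
    xRight = -std::numeric_limits<float>::infinity();
    bool hit = false;
    for (int i = 0; i < 3; ++i) {
        Vec2 p = v[i];
        Vec2 q = v[(i + 1) % 3];
        if (p.y == q.y) continue;  // horizontal side: the other two sides bound it
        if (rowY < std::min(p.y, q.y) || rowY > std::max(p.y, q.y)) continue;
        float t = (rowY - p.y) / (q.y - p.y);  // position along the side
        float x = p.x + t * (q.x - p.x);
        xLeft = std::min(xLeft, x);
        xRight = std::max(xRight, x);
        hit = true;
    }
    return hit;
}

// Visits only the pixels inside each row's span and returns how many were visited.
int rasterizeScanline(Vec2 a, Vec2 b, Vec2 c, int width, int height) {
    int minY = std::max(0, static_cast<int>(std::floor(std::min({a.y, b.y, c.y}))));
    int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({a.y, b.y, c.y}))));
    int shaded = 0;
    for (int y = minY; y <= maxY; ++y) {
        float xl, xr;
        if (!rowSpan(a, b, c, y + 0.5f, xl, xr)) continue;
        // Keep the pixels whose centers (x + 0.5) lie within [xl, xr].
        int x0 = std::max(0, static_cast<int>(std::ceil(xl - 0.5f)));
        int x1 = std::min(width - 1, static_cast<int>(std::floor(xr - 0.5f)));
        for (int x = x0; x <= x1; ++x) {
            ++shaded;  // a real rasterizer would emit a fragment here
        }
    }
    return shaded;
}
```

For a right triangle, the span loop visits about half as many pixels as a full bounding-box scan, at the cost of a few divisions per row.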
I implemented backface culling for the rasterizer. Here is the efficiency comparison.
As the graph shows, backface culling slightly improves the overall efficiency. Backface culling filters out the triangles we cannot see, but as discussed above, the number of triangles is not the main factor in the time cost, so it does not make a large difference to the overall time cost.
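One common way to express the culling test is through the sign of the projected triangle's area. The C++ sketch below assumes counter-clockwise front faces in a y-up screen space; that convention, and the names `Triangle`, `isBackFacing`, and `cullBackFaces`, are assumptions for illustration and may differ from what this project does on the GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };
struct Triangle { Vec2 v[3]; };

// Twice the signed area of the projected triangle.
inline float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Assumption: front faces are counter-clockwise (positive area) on screen.
// Zero-area triangles are treated as back-facing since they cover no pixels.
inline bool isBackFacing(const Triangle& t) {
    return signedArea2(t.v[0], t.v[1], t.v[2]) <= 0.0f;
}

// Removes back-facing triangles in place and returns how many remain.
std::size_t cullBackFaces(std::vector<Triangle>& tris) {
    auto newEnd = std::remove_if(tris.begin(), tris.end(), isBackFacing);
    tris.erase(newEnd, tris.end());
    return tris.size();
}
```

Culling before rasterization shrinks the triangle list, but it leaves the per-triangle scan cost of the remaining front faces unchanged.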
Supersampling is used to reduce aliasing (jagged, pixelated edges). It is achieved by rendering the image at a higher resolution than the one being displayed and then shrinking it to the desired size, using the extra pixels for the calculation. The color of each pixel in the final image is the average color of the corresponding pixels in the higher-resolution image. Since we render an image at a higher resolution, the rasterization process is slower.
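The averaging step can be sketched in C++ as a box filter over each block of supersampled pixels. The `Color` struct, the `downsample` function, and the row-major buffer layout are assumptions made for this illustration, with `ssaa` standing for the per-axis supersampling factor (so `ssaa = 2` renders four samples per output pixel).

```cpp
#include <vector>

struct Color { float r, g, b; };

// Averages each ssaa x ssaa block of the high-resolution image into one
// output pixel. `hi` is row-major with dimensions (outW * ssaa) x (outH * ssaa).
std::vector<Color> downsample(const std::vector<Color>& hi,
                              int outW, int outH, int ssaa) {
    const int hiW = outW * ssaa;
    const float inv = 1.0f / static_cast<float>(ssaa * ssaa);
    std::vector<Color> out(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            Color sum{0.0f, 0.0f, 0.0f};
            for (int sy = 0; sy < ssaa; ++sy) {
                for (int sx = 0; sx < ssaa; ++sx) {
                    const Color& s = hi[(y * ssaa + sy) * hiW + (x * ssaa + sx)];
                    sum.r += s.r;
                    sum.g += s.g;
                    sum.b += s.b;
                }
            }
            out[y * outW + x] = Color{sum.r * inv, sum.g * inv, sum.b * inv};
        }
    }
    return out;
}
```

The high-resolution buffer holds `ssaa * ssaa` times as many pixels as the output, so both the fragment count and the frame buffer memory grow with the square of `ssaa`.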
SSAA=1.gltf | SSAA=2.gltf | SSAA=4.gltf |
checkerboard NoPerspective | checkerboard Perspective |
No Bilinear Interpolation | Bilinear Interpolation |
For line rendering, I use Bresenham's line algorithm.
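For reference, here is a compact integer-only form of Bresenham's line algorithm in C++ that handles all octants. It is a generic sketch rather than this project's kernel: the function name `bresenhamLine` is hypothetical, and it returns the pixel list instead of writing into a frame buffer.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Returns the pixels on the line from (x0, y0) to (x1, y1), endpoints included.
std::vector<std::pair<int, int>> bresenhamLine(int x0, int y0, int x1, int y1) {
    std::vector<std::pair<int, int>> pixels;
    const int dx = std::abs(x1 - x0);
    const int dy = -std::abs(y1 - y0);
    const int sx = (x0 < x1) ? 1 : -1;  // step direction along x
    const int sy = (y0 < y1) ? 1 : -1;  // step direction along y
    int err = dx + dy;                  // accumulated error term

    while (true) {
        pixels.emplace_back(x0, y0);
        if (x0 == x1 && y0 == y1) break;
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  // step in x
        if (e2 <= dx) { err += dx; y0 += sy; }  // step in y
    }
    return pixels;
}
```

To draw a wireframe, one would call this once per triangle edge with the rounded screen-space endpoints.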
duck.gltf (Blinn-Phong) | cow.gltf (Blinn-Phong) | CeisumMilkTruck.gltf (Blinn-Phong) |