Project 4: Ricky Rajani #15

85 changes: 78 additions & 7 deletions README.md
@@ -1,20 +1,91 @@
CUDA Rasterizer
===============

[CLICK ME FOR INSTRUCTION OF THIS PROJECT](./INSTRUCTION.md)

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* Ricky Rajani
* Tested on: Windows 7, i7-6700 @ 3.40GHz 16GB, NVIDIA Quadro K620 (Moore 100C Lab)

This project implements a simplified rasterized graphics pipeline, similar to the OpenGL pipeline, using CUDA.

### Core Features:
- Vertex assembly
- Vertex shading
- Primitive assembly
- Depth test
- Rasterization
- Race avoidance using atomic functions
- Fragment shading with Lambertian shading
- Framebuffer
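
The depth test and the race avoidance listed above usually go together: several fragments may target the same pixel at once, and only the closest one should win. The following is a minimal host-side C++ sketch of that idea, not the code in `rasterize.cu`. The `DepthBuffer` type, the integer depth encoding, and the `testAndSet` name are assumptions made for illustration; in a CUDA kernel the same effect is commonly obtained with `atomicMin` on an integer depth buffer.

```cpp
#include <atomic>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical integer depth buffer: a smaller value means closer to the camera.
struct DepthBuffer {
    std::vector<std::atomic<int>> depth;

    explicit DepthBuffer(std::size_t pixelCount) : depth(pixelCount) {
        // Start every pixel at the farthest possible depth.
        for (auto& d : depth) d.store(INT_MAX);
    }

    // Atomically lowers depth[pixel] to `candidate` if the candidate is closer.
    // Returns true when this fragment won the depth test.
    bool testAndSet(std::size_t pixel, int candidate) {
        int current = depth[pixel].load();
        while (candidate < current) {
            // On failure, compare_exchange_weak reloads `current`,
            // so the loop condition is re-evaluated against the new value.
            if (depth[pixel].compare_exchange_weak(current, candidate)) {
                return true;
            }
        }
        return false;
    }
};
```

Only a fragment for which `testAndSet` returns true would go on to write its color, so two threads racing on the same pixel cannot both believe they are the closest.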

### Extra Features:
- Support for rasterizing additional primitives: lines and points
- UV texture mapping with bilinear texture filtering and perspective-correct texture coordinates

### Flags:
These flags can be found at the top of ```rasterize.cu```
- ```LIGHTING``` - Enables Lambertian shading
- ```TEXTURE``` - Enables UV texture mapping; models are white by default
- ```BILINEAR``` - Enables bilinear texture filtering when ```TEXTURE``` is enabled
- ```PERSPECTIVE``` - Enables perspective correction when ```TEXTURE``` is enabled
- ```POINTS``` - Enables points instead of triangle primitives
- ```POINTCLOUD``` - Sets the sparsity of points (the step size between rendered points)
- ```LINE``` - Enables lines instead of triangle primitives

# Samples

#### Demos of scenes using the basic rasterization pipeline with Lambertian shading

Cow | Duck
:-------------------------------: | :-------------------------------:
![](renders/cow_normal.PNG) | ![](renders/duck_normal.PNG)

Engine | Truck
:-------------------------------: | :-------------------------------:
![](renders/engine_normal.PNG) | ![](renders/truck_normal.PNG)

#### Demos of scenes using additional primitives

Duck | Cow
:-------------------------------: | :-------------------------------:
![](renders/duck_points_10.PNG) | ![](renders/cow_points_10.PNG)

Duck | Cow | Truck
:-------------------------------: | :-------------------------------: | :-------------------------------:
![](renders/duck_lines.PNG) | ![](renders/cow_lines.PNG) | ![](renders/truck_lines.PNG)


![](renders/rasterize-graph.PNG)

The point primitives were rendered with a step size of 50, so they require fewer iterations than line primitives. This explains why switching to point primitives incurs a smaller performance hit than switching to line primitives.

#### Demos of scenes using UV texture mapping

Duck | Truck
:-------------------------------: | :-------------------------------:
![](renders/duck_texture.PNG) | ![](renders/truck_texture.PNG)


Checkerboard | Checkerboard with Bilinear Filtering | Checkerboard with Perspective Correction
:-------------------------------: | :-------------------------------: | :-------------------------------:
![](renders/checkerboard-normal.PNG) | ![](renders/checkerboard-bilinear.PNG) | ![](renders/checkerboard-perspective.PNG)

Bilinear filtering is an antialiasing technique that produces smoother edges on the checkerboard than UV texture mapping without filtering. The four texels nearest to the pixel center are sampled, and their colors are combined in a weighted average according to distance. This removes the 'blockiness' seen during magnification, because color now changes in a smooth gradient from one texel to the next. Profiling ```kernTextureMap``` with bilinear texture filtering turned off and on shows a measurable performance hit: the kernel takes 0.88 ms per iteration without filtering and 1.02 ms with filtering, roughly 16% slower.
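
The weighted average of the four nearest texels can be sketched as follows. This is an illustrative host-side C++ version, not the body of ```kernTextureMap```: the `Texture` struct, the single-channel texels, the clamp-to-edge addressing, and the half-texel offset convention are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical single-channel texture stored row-major.
struct Texture {
    int width;
    int height;
    std::vector<float> texels;

    // Clamp-to-edge lookup so neighbors just outside the texture stay valid.
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return texels[y * width + x];
    }
};

// Samples the texture at normalized coordinates (u, v) in [0, 1].
float sampleBilinear(const Texture& tex, float u, float v) {
    // Map to texel space; texel centers sit at integer + 0.5.
    float x = u * tex.width - 0.5f;
    float y = v * tex.height - 0.5f;
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    float fx = x - x0;  // horizontal weight of the right-hand texels
    float fy = y - y0;  // vertical weight of the lower texels

    // Blend horizontally on both rows, then blend the two rows vertically.
    float top    = (1.0f - fx) * tex.at(x0, y0)     + fx * tex.at(x0 + 1, y0);
    float bottom = (1.0f - fx) * tex.at(x0, y0 + 1) + fx * tex.at(x0 + 1, y0 + 1);
    return (1.0f - fy) * top + fy * bottom;
}
```

Each sample costs four texel fetches and three interpolations instead of one fetch, which is consistent with the filtered kernel being somewhat slower than the unfiltered one.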

Perspective correction fixes the distortion that occurs in the uncorrected checkerboard. As the charts below show, it greatly reduces the FPS, a non-trivial performance hit. Bilinear texture filtering also reduces the FPS, but by less than perspective correction does.

# Performance Analysis

![](renders/fps_graph.PNG)

![](renders/pipeline_timing_graph.PNG)

The number of primitives in a scene does not directly determine the time spent in each pipeline stage, in particular the rasterization stage, which consumes the most time. In the charts above, the cow has over 5000 primitives and its rasterization stage takes a third of the total time, whereas the box has only 12 primitives but its rasterization stage takes up most of the total time. A likely explanation is the size of the primitives. Rasterization iterates over each primitive's entire bounding box, so a larger primitive has a larger bounding box and requires more work per primitive. This may lead to a considerable performance hit.
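
The relationship between primitive size and rasterization work can be illustrated with a small host-side C++ sketch. It assumes one bounding-box scan per triangle with an edge-function coverage test at pixel centers; the `Point` and `RasterStats` types and the `rasterizeTriangle` name are hypothetical, and the sketch only counts pixels instead of shading them.

```cpp
#include <algorithm>
#include <cmath>

struct Point {
    float x;
    float y;
};

struct RasterStats {
    int visited;  // pixels scanned inside the bounding box
    int covered;  // pixels whose center lies inside the triangle
};

// Twice the signed area of triangle (a, b, c); the sign tells which side of ab c is on.
static float edge(const Point& a, const Point& b, const Point& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

RasterStats rasterizeTriangle(const Point tri[3], int width, int height) {
    // Bounding box of the triangle, clamped to the framebuffer.
    int minX = std::max(0, static_cast<int>(std::floor(std::min({tri[0].x, tri[1].x, tri[2].x}))));
    int maxX = std::min(width - 1, static_cast<int>(std::ceil(std::max({tri[0].x, tri[1].x, tri[2].x}))));
    int minY = std::max(0, static_cast<int>(std::floor(std::min({tri[0].y, tri[1].y, tri[2].y}))));
    int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({tri[0].y, tri[1].y, tri[2].y}))));

    RasterStats stats{0, 0};
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            ++stats.visited;
            Point p{x + 0.5f, y + 0.5f};  // pixel center
            float e0 = edge(tri[1], tri[2], p);
            float e1 = edge(tri[2], tri[0], p);
            float e2 = edge(tri[0], tri[1], p);
            // Accept either winding order: all edge values share a sign when p is inside.
            bool inside = (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0);
            if (inside) ++stats.covered;
        }
    }
    return stats;
}
```

In this sketch the loop count depends only on the bounding-box area, so a scene with a few screen-filling triangles can require more iterations per primitive than a scene with thousands of small ones.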

Note: `CMakeLists.txt` was modified to include `common.h`, which is used to record performance timings.

### Credits

* [tinygltfloader](https://github.com/syoyo/tinygltfloader) by [@syoyo](https://github.com/syoyo)
* [glTF Sample Models](https://github.com/KhronosGroup/glTF/blob/master/sampleModels/README.md)
* [Bilinear texture filtering](https://en.wikipedia.org/wiki/Bilinear_interpolation)
* [Perspective correctness](https://en.wikipedia.org/wiki/Texture_mapping#Perspective_correctness)
Binary file added renders/checkerboard-bilinear.PNG
Binary file added renders/checkerboard-normal.PNG
Binary file added renders/checkerboard-perspective.PNG
Binary file added renders/cow_lines.PNG
Binary file added renders/cow_normal.PNG
Binary file added renders/cow_points.PNG
Binary file added renders/cow_points_10.PNG
Binary file added renders/duck_lines.PNG
Binary file added renders/duck_normal.PNG
Binary file added renders/duck_points.PNG
Binary file added renders/duck_points_10.PNG
Binary file added renders/duck_texture.PNG
Binary file added renders/engine_normal.PNG
Binary file added renders/fps_graph.PNG
Binary file added renders/pipeline_timing_graph.PNG
Binary file added renders/rasterize-graph.PNG
Binary file added renders/truck_lines.PNG
Binary file added renders/truck_normal.PNG
Binary file added renders/truck_points.PNG
Binary file added renders/truck_texture.PNG
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -1,4 +1,5 @@
set(SOURCE_FILES
"common.h"
"rasterize.cu"
"rasterize.h"
"rasterizeTools.h"
112 changes: 112 additions & 0 deletions src/common.h
@@ -0,0 +1,112 @@
#pragma once

#include <cuda.h>
#include <cuda_runtime.h>

#include <cstdio>
#include <cstring>
#include <cmath>
#include <algorithm>
#include <chrono>
#include <stdexcept>

namespace StreamCompaction {
    namespace Common {
        __global__ void kernMapToBoolean(int n, int *bools, const int *idata);

        __global__ void kernScatter(int n, int *odata,
                const int *idata, const int *bools, const int *indices);

        /**
        * This class is used for timing the performance
        * Uncopyable and unmovable
        *
        * Adapted from WindyDarian(https://github.com/WindyDarian)
        */
        class PerformanceTimer
        {
        public:
            PerformanceTimer()
            {
                cudaEventCreate(&event_start);
                cudaEventCreate(&event_end);
            }

            ~PerformanceTimer()
            {
                cudaEventDestroy(event_start);
                cudaEventDestroy(event_end);
            }

            void startCpuTimer()
            {
                if (cpu_timer_started) { throw std::runtime_error("CPU timer already started"); }
                cpu_timer_started = true;

                time_start_cpu = std::chrono::high_resolution_clock::now();
            }

            void endCpuTimer()
            {
                time_end_cpu = std::chrono::high_resolution_clock::now();

                if (!cpu_timer_started) { throw std::runtime_error("CPU timer not started"); }

                std::chrono::duration<double, std::milli> duro = time_end_cpu - time_start_cpu;
                prev_elapsed_time_cpu_milliseconds =
                    static_cast<decltype(prev_elapsed_time_cpu_milliseconds)>(duro.count());

                cpu_timer_started = false;
            }

            void startGpuTimer()
            {
                if (gpu_timer_started) { throw std::runtime_error("GPU timer already started"); }
                gpu_timer_started = true;

                cudaEventRecord(event_start);
            }

            void endGpuTimer()
            {
                cudaEventRecord(event_end);
                cudaEventSynchronize(event_end);

                if (!gpu_timer_started) { throw std::runtime_error("GPU timer not started"); }

                cudaEventElapsedTime(&prev_elapsed_time_gpu_milliseconds, event_start, event_end);
                gpu_timer_started = false;
            }

            float getCpuElapsedTimeForPreviousOperation() //noexcept //(damn I need VS 2015
            {
                return prev_elapsed_time_cpu_milliseconds;
            }

            float getGpuElapsedTimeForPreviousOperation() //noexcept
            {
                return prev_elapsed_time_gpu_milliseconds;
            }

            // remove copy and move functions
            PerformanceTimer(const PerformanceTimer&) = delete;
            PerformanceTimer(PerformanceTimer&&) = delete;
            PerformanceTimer& operator=(const PerformanceTimer&) = delete;
            PerformanceTimer& operator=(PerformanceTimer&&) = delete;

        private:
            cudaEvent_t event_start = nullptr;
            cudaEvent_t event_end = nullptr;

            using time_point_t = std::chrono::high_resolution_clock::time_point;
            time_point_t time_start_cpu;
            time_point_t time_end_cpu;

            bool cpu_timer_started = false;
            bool gpu_timer_started = false;

            float prev_elapsed_time_cpu_milliseconds = 0.f;
            float prev_elapsed_time_gpu_milliseconds = 0.f;
        };
    }
}