The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
A pipeline was produced to process the images in the video, which will be explained in the following steps.
The camera calibration was performed using the chessboard images in the camera calibration folder. I start by preparing the "object points", which are the (x, y, z) coordinates of the chessboard corners in the world. Here I am assuming the chessboard is fixed on the (x, y) plane at z=0, such that the object points are the same for each calibration image. Thus, `objp` is just a replicated array of coordinates, and `objpoints` is appended with a copy of it every time `cv2.findChessboardCorners()` successfully detects all chessboard corners in a calibration image. `imgpoints` is appended with the (x, y) pixel position of each of the corners in the image plane with each successful chessboard detection.
I then used the output `objpoints` and `imgpoints` to compute the camera calibration matrix and distortion coefficients using the `cv2.calibrateCamera()` function. I applied this distortion correction to a test image using the `cv2.undistort()` function and obtained this result:
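A sketch of this calibration step is below; the 9x6 pattern size and the `camera_cal/` path are assumptions for illustration, not necessarily the exact values in the project file.

```python
import glob

import cv2
import numpy as np

# Prepare object points for an assumed 9x6 inner-corner chessboard:
# (0, 0, 0), (1, 0, 0), ..., (8, 5, 0).
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

objpoints = []  # 3D points in world space
imgpoints = []  # 2D points in the image plane

for fname in glob.glob('camera_cal/calibration*.jpg'):  # hypothetical path
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (9, 6), None)
    if found:  # keep only images where every corner was detected
        objpoints.append(objp)
        imgpoints.append(corners)

# Compute the camera matrix and distortion coefficients, then undistort.
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)
undistorted = cv2.undistort(img, mtx, dist, None, mtx)
```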
The first step of the pipeline is to undistort incoming images. The image below shows an example of an undistorted image.
The image was subsequently passed through a perspective transform to obtain a bird's-eye view of the road for lane detection. This is done by identifying a polygon in the raw image to be mapped to a rectangle in the transformed image. To simplify the process, an image in which the road was straight was used, since the transform could then be validated by checking that the warped lane lines form a rectangle.
I chose to hardcode the source and destination points. The exact parameters may be found in Code Block 14 of the project file.
I verified that my perspective transform was working as expected by drawing the `src` and `dst` points onto a test image and its warped counterpart and checking that the lines appear parallel in the warped image.
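For illustration, the transform can be set up as follows; the `src` and `dst` values shown here are placeholders, not the actual points from Code Block 14.

```python
import cv2
import numpy as np

# Placeholder points for a 1280x720 image: src traces a trapezium around
# the lane in the straight-road image, dst maps it to a rectangle.
src = np.float32([[580, 460], [700, 460], [1040, 680], [260, 680]])
dst = np.float32([[260, 0], [1040, 0], [1040, 720], [260, 720]])

M = cv2.getPerspectiveTransform(src, dst)     # raw -> bird's-eye view
Minv = cv2.getPerspectiveTransform(dst, src)  # inverse, used to unwarp later
warped = cv2.warpPerspective(undistorted, M, (1280, 720), flags=cv2.INTER_LINEAR)
```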
The next step in the pipeline is to generate a binary image based on thresholds across different colorspaces, to increase the contrast between the lane lines and the ground. Several colorspaces and gradient methods were explored; the experimentation may be found in the Project_working.ipynb notebook. These include the Sobel operator and the RGB, HLS, YUV, and grayscale colorspaces.
The final lane-line detection was produced by merging the results of the B-channel from the Lab colorspace and the L-channel from the LUV colorspace. The B-channel detects the yellow lane lines, while the L-channel detects the white lines.
The final color transformation is done via the `lanes_bw` function.
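A minimal sketch of what `lanes_bw` might look like follows; the threshold values are illustrative, not the tuned ones from the project file.

```python
import cv2
import numpy as np

def lanes_bw(img, b_thresh=(145, 255), l_thresh=(215, 255)):
    """Threshold the Lab B-channel (yellow lines) and the LUV L-channel
    (white lines), then OR the two masks into one binary image.
    Threshold defaults are illustrative, not the tuned values."""
    b = cv2.cvtColor(img, cv2.COLOR_RGB2LAB)[:, :, 2]  # Lab B-channel
    l = cv2.cvtColor(img, cv2.COLOR_RGB2LUV)[:, :, 0]  # LUV L-channel

    binary = np.zeros_like(b)
    binary[((b >= b_thresh[0]) & (b <= b_thresh[1])) |
           ((l >= l_thresh[0]) & (l <= l_thresh[1]))] = 1
    return binary
```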
The sliding window method was used to detect the lane pixels for the left and right lane lines. This is performed in the `find_lanes` function, which returns the polynomial coefficients of the left and right lane lines, as well as the indices of the left and right lane pixels in the image.
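A condensed sketch of the sliding-window search follows; the window count, margin, and minimum-pixel parameters are illustrative defaults, not necessarily those tuned in `find_lanes`.

```python
import numpy as np

def find_lanes_sketch(binary_warped, nwindows=9, margin=100, minpix=50):
    """Condensed sliding-window search: locate each line's base with a
    histogram, step windows up the image, then fit x = f(y) quadratics."""
    # A histogram of the lower half of the image locates the base of each line.
    histogram = np.sum(binary_warped[binary_warped.shape[0] // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    leftx_current = np.argmax(histogram[:midpoint])
    rightx_current = np.argmax(histogram[midpoint:]) + midpoint

    nonzeroy, nonzerox = binary_warped.nonzero()
    window_height = binary_warped.shape[0] // nwindows
    left_inds, right_inds = [], []

    for window in range(nwindows):
        y_low = binary_warped.shape[0] - (window + 1) * window_height
        y_high = binary_warped.shape[0] - window * window_height

        good_left = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                     (nonzerox >= leftx_current - margin) &
                     (nonzerox < leftx_current + margin)).nonzero()[0]
        good_right = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                      (nonzerox >= rightx_current - margin) &
                      (nonzerox < rightx_current + margin)).nonzero()[0]
        left_inds.append(good_left)
        right_inds.append(good_right)

        # Re-center the next window on the mean x of the pixels just found.
        if len(good_left) > minpix:
            leftx_current = int(np.mean(nonzerox[good_left]))
        if len(good_right) > minpix:
            rightx_current = int(np.mean(nonzerox[good_right]))

    left_inds = np.concatenate(left_inds)
    right_inds = np.concatenate(right_inds)

    # Fit a second-order polynomial x = A*y^2 + B*y + C to each line.
    left_fit = np.polyfit(nonzeroy[left_inds], nonzerox[left_inds], 2)
    right_fit = np.polyfit(nonzeroy[right_inds], nonzerox[right_inds], 2)
    return left_fit, right_fit, left_inds, right_inds
```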
The radius of curvature was computed in pixels, then converted to meters using a meters-per-pixel conversion factor. This can be found in the `get_rad_and_curv` function. The radii of curvature of the left and right lane lines are computed, then averaged to obtain the radius of curvature at the center of the lane.
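The underlying computation, sketched below with assumed meters-per-pixel factors (the tuned values live in `get_rad_and_curv`), refits each line in world-space units and evaluates the standard radius-of-curvature formula `R = (1 + (2Ay + B)^2)^(3/2) / |2A|` at the bottom of the image:

```python
import numpy as np

# Assumed conversion factors: ~30 m of lane over 720 rows, and a 3.7 m lane
# width over ~700 columns. The project's tuned values may differ.
ym_per_pix = 30 / 720
xm_per_pix = 3.7 / 700

def curvature_m(xs, ys, y_eval_px):
    """Fit x = A*y^2 + B*y + C in world-space units, then evaluate
    R = (1 + (2*A*y + B)**2)**1.5 / abs(2*A) at y_eval_px."""
    A, B, _ = np.polyfit(ys * ym_per_pix, xs * xm_per_pix, 2)
    y = y_eval_px * ym_per_pix
    return (1 + (2 * A * y + B) ** 2) ** 1.5 / abs(2 * A)

# Average the two per-line radii for the radius at the lane center, e.g.:
# radius = (curvature_m(leftx, lefty, 719) + curvature_m(rightx, righty, 719)) / 2
```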
Assuming the camera is mounted in the center of the car, the center of the car should sit at the center of the image. After obtaining the positions of the left and right lane lines, we can also obtain the position of the center of the lane. The offset is then the number of pixels between the center of the image and the lane center; multiplying this pixel count by the conversion factor gives the off-center distance of the car, as sketched below.
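A sketch of that calculation, with hypothetical variable names (`left_x_bottom` and `right_x_bottom` stand for the fitted line positions at the bottom row of a 1280-pixel-wide image):

```python
# Camera assumed centered on the car, so the car sits at the image center.
image_center_px = 1280 / 2
# Lane center from the fitted lines evaluated at the image bottom.
lane_center_px = (left_x_bottom + right_x_bottom) / 2
# Signed offset in meters; the sign tells which side of center the car is on.
offset_m = (image_center_px - lane_center_px) * xm_per_pix
```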
The resulting polynomial was then unwarped and superimposed onto the original image.
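A sketch of that step, assuming `Minv` is the inverse perspective matrix from earlier and `ploty`, `left_fitx`, and `right_fitx` are the y values and fitted x values of the two polynomials:

```python
import cv2
import numpy as np

# Fill the region between the two fitted lines on a blank warped canvas.
lane = np.zeros_like(undistorted)
pts_left = np.transpose(np.vstack([left_fitx, ploty]))
pts_right = np.flipud(np.transpose(np.vstack([right_fitx, ploty])))
pts = np.int32(np.vstack((pts_left, pts_right)))
cv2.fillPoly(lane, [pts], (0, 255, 0))

# Unwarp the lane polygon back to the road view and blend it in.
unwarped = cv2.warpPerspective(lane, Minv, (undistorted.shape[1], undistorted.shape[0]))
result = cv2.addWeighted(undistorted, 1, unwarped, 0.3, 0)
```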
Here's a link to my video result
Here I'll talk about the approach I took, what techniques I used, what worked and why, where the pipeline might fail and how I might improve it if I were going to pursue this project further.
There were several difficulties faced while working on this project.
- Hardcoding of the perspective transform. The perspective transform was hardcoded, meaning it is highly contingent on a fixed camera angle; any slight shift in the camera angle could invalidate the transform. To improve the robustness of the algorithm, the perspective transform should be calibrated on a per-video basis. Calibration is also best performed while driving on a straight road, as it is difficult to obtain a trapezium for the perspective transform on a curved road.
- Obtaining an optimal threshold for identifying the lane lines. The grey roads tended to interfere with lane detection, and the yellow lane lines were particularly difficult to detect. As such, the HSV colorspace was used to detect the yellow lines and YUV for the white lines; superimposing the results produced a binary image that could detect both yellow and white lines.
- Lane line detection: the pipeline would tend to crash should it not be able to detect lane lines on a particular side of the road. As such, the thresholds were loosened in order to allow for the detection of lane lines.
- Using the cv2 and ffmpeg libraries to read images produced different results, because cv2 defaults to the BGR colorspace while ffmpeg defaults to RGB. It is therefore imperative to ensure that images fed through the pipeline use a standardized colorspace, as shown in the snippet below; otherwise even a working lane-detection algorithm will produce skewed results.
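For example, converting once at the entry point (the path here is hypothetical) keeps the rest of the pipeline in RGB:

```python
import cv2

# cv2.imread returns BGR; convert to RGB to match frames read via ffmpeg/moviepy.
img_rgb = cv2.cvtColor(cv2.imread('test_images/test1.jpg'), cv2.COLOR_BGR2RGB)
```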