Computer Vision
February 14, 2011
An implementation of the Canny Edge Detector. My detector computes the gradient of the Gaussian (using gaussgradient() below), pads the image with replications of the edge pixels, convolves the image with the gradient of the Gaussian, and crops the image and gradients back to the original image size. It then computes the intensity and direction of the gradient at each point.
Once the intensity and direction have been calculated, it runs the non-maximum suppression algorithm, which zeros out every pixel whose intensity is not a local maximum (by following the direction of the gradient one pixel each way and testing for a larger intensity). It then performs hysteresis by stepping through each pixel: if the intensity is larger than T_h, it keeps that intensity and follows the paths perpendicular to the gradient direction until a pixel's intensity falls below T_l.
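The non-maximum suppression step described above can be sketched as follows. This is a NumPy sketch, not the original MATLAB code; it assumes `direction` holds the binned gradient angles (in degrees) described later in the post.

```python
import numpy as np

def nonmax_suppress(intensity, direction):
    """Zero out pixels whose gradient intensity is not a local maximum
    along the (binned) gradient direction, given in degrees."""
    steps = {0: (1, 0), 45: (1, 1), 90: (0, 1), -45: (1, -1), -90: (0, -1)}
    out = intensity.copy()
    h, w = intensity.shape
    for y in range(h):
        for x in range(w):
            dx, dy = steps[int(direction[y, x])]
            # Compare against the next pixel in the gradient direction and
            # the next pixel in the opposite direction.
            for nx, ny in ((x + dx, y + dy), (x - dx, y - dy)):
                if 0 <= nx < w and 0 <= ny < h and intensity[ny, nx] > intensity[y, x]:
                    out[y, x] = 0.0
                    break
    return out
```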
Once finished, the algorithm can brighten the edges to full white and either write multiple files to disk or display the output.
Function that returns the gradient of the Gaussian function. Given an x and y coordinate and a sigma value, returns the partial derivative of the Gaussian with respect to x and with respect to y. Note: the x and y coordinates are centered around the origin.
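A minimal sketch of this function in NumPy (the original is MATLAB; the name `gaussgradient` is carried over from the post, the Python signature is my assumption). It evaluates the closed-form partial derivatives of the 2-D Gaussian G(x, y) = exp(-(x² + y²)/(2σ²)) / (2πσ²):

```python
import numpy as np

def gaussgradient(x, y, sigma):
    """Partial derivatives of the 2-D Gaussian at (x, y), with the
    coordinates centered on the origin, as described above."""
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    gx = -x / sigma**2 * g   # dG/dx
    gy = -y / sigma**2 * g   # dG/dy
    return gx, gy
```

Because the derivative is odd in its own variable, dG/dx vanishes along the y-axis and flips sign across it.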
Function that returns the next two pixels in a given direction: the next pixel in that direction and the next pixel in the opposite direction. Given an x and y coordinate, as well as a direction, this function computes the next (x,y) along that vector as well as the next (x,y) in the opposite direction.
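A sketch of this helper (the name `neighbors_along` is a hypothetical stand-in; directions are assumed to be the binned angles 0, 45, 90, -45, -90 degrees used elsewhere in the post):

```python
def neighbors_along(x, y, direction):
    """Next pixel in the given binned direction (degrees) and the next
    pixel in the opposite direction."""
    steps = {0: (1, 0), 45: (1, 1), 90: (0, 1), -45: (1, -1), -90: (0, -1)}
    dx, dy = steps[direction]
    return (x + dx, y + dy), (x - dx, y - dy)
```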
Function that returns the next two pixels perpendicular to a given direction. Given an x and y coordinate, as well as a direction, this function computes the next pixel for each vector perpendicular to that direction.
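Similarly, a sketch of the perpendicular version (again with a hypothetical name; each binned direction is mapped to a step vector at right angles to it):

```python
def neighbors_perpendicular(x, y, direction):
    """Next pixel along each of the two vectors perpendicular to the given
    binned direction (degrees); e.g. perpendicular to a 90-degree (vertical)
    gradient is a horizontal step."""
    perp_steps = {0: (0, 1), 45: (1, -1), 90: (1, 0), -45: (1, 1), -90: (1, 0)}
    dx, dy = perp_steps[direction]
    return (x + dx, y + dy), (x - dx, y - dy)
```

This is what the hysteresis step uses to walk along an edge, since edges run perpendicular to the gradient.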
My edge detector has a few user-selectable options:
First, I created a test image, bigtest.jpg, which consists of a white background with random black shapes, seen below:
Bigtest image
Then, I looked at the gradient of the Gaussian, with sigma = 2, to ensure that the gradient was produced correctly. Below are first the x-gradient matrix, followed by the y-gradient matrix:
Gradient of the Gaussian (x-direction)
Gradient of the Gaussian (y-direction)
This image produced the following x- and y-gradients, when the gradient of the Gaussian was convolved with the image:
Gradient of the smoothed image in the x direction
Gradient of the smoothed image in the y direction
This confirmed that my algorithm was correctly finding the gradients and convolving them with the image. Then, I calculated the intensity and direction of the gradient at each pixel (note that I binned the directions to 0, 45, 90, -45, and -90 degrees).
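The intensity-and-binned-direction computation can be sketched in NumPy (a sketch under the assumption that opposite gradient directions are folded together, since non-maximum suppression checks both ways anyway):

```python
import numpy as np

def gradient_intensity_direction(gx, gy):
    """Gradient magnitude, plus the direction binned to multiples of 45
    degrees (0, 45, 90, -45, -90)."""
    intensity = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx))
    angle = (angle + 90.0) % 180.0 - 90.0   # fold opposite directions together
    binned = 45.0 * np.round(angle / 45.0)  # snap to the nearest 45-degree bin
    return intensity, binned
```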
Intensity of the gradient (before non-maximum suppression)
Direction of the gradient vector at each point
Using the intensity and direction, the algorithm computed the following results: first non-maximum suppression, then the final result after hysteresis. Note that I used the BRIGHTEN_EDGES option to set all maxima (and, at the end, all lines) to full white. An interesting side effect is that under non-maximum suppression, the white space between the shapes also counted as maxima, because the gradient there came out as about .00017 instead of 0, which throws the algorithm off in certain places. This extra "noise" is eliminated, however, once the thresholds are applied in the hysteresis step.
Intensity of the gradient (after non-maximum suppression)
Non-maximum suppression final image (all edges elevated to 255)
Final image after Canny: sigma = 4, gradient matrix size = 40px, T_h = 9, T_l = 1
I also tested the edge detector on other images, both from class and from my own Flickr account. The outputs are below (image, non-maximum suppression, and final Canny output); mouse over to read the sigma, gradient matrix size, and threshold values. The parameters for each image had to be tweaked to get the best output. I'll discuss choosing good values for these variables later.
Gradient of the smoothed image in the x direction
Gradient of the smoothed image in the y direction
Intensity of the gradient (before non-maximum suppression)
Direction of the gradient vector at each point
Intensity of the gradient (after non-maximum suppression)
Non-maximum suppression final image (all edges elevated to 255)
Final image after Canny: sigma = 3, gradient matrix size = 40px, T_h = 3, T_l = 2
Gradient of the smoothed image in the x direction
Gradient of the smoothed image in the y direction
Intensity of the gradient (before non-maximum suppression)
Direction of the gradient vector at each point
Intensity of the gradient (after non-maximum suppression)
Non-maximum suppression final image (all edges elevated to 255)
Final image after Canny: sigma = 1, gradient matrix size = 16px, T_h = 3, T_l = 1
Final image after Canny: sigma = 3, gradient matrix size = 32px, T_h = 3, T_l = 1
Final image after Canny: sigma = 1, gradient matrix size = 16px, T_h = 40, T_l = 10
Intensity of the gradient (after non-maximum suppression)
Non-maximum suppression final image (all edges elevated to 255)
Final image after Canny: sigma = 1, gradient matrix size = 16px, T_h = 17, T_l = 1
Non-maximum suppression final image (all edges elevated to 255)
Final image after Canny: sigma = 1, gradient matrix size = 16px, T_h = 5, T_l = 1
The initial algorithm I employed did not pad the image, so the conv2 function extended the image by padding it with 0s, causing some problems around the edges. That was remedied by padding the image with replications of the edge pixels. One of the more interesting phenomena I noticed is that the gradients of a completely white image have a magnitude greater than 0, approximately .00017, which I assume is due to floating-point rounding error. Needless to say, it does not affect the final output of either the edge or the corner detector. It only became apparent when I set the maximal values of the non-maximum suppression output to full white instead of their original intensity values.
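The replicate-padding fix corresponds to one call in NumPy (in MATLAB this would be padarray with the 'replicate' option; the snippet below is a sketch of the idea, not the original code):

```python
import numpy as np

img = np.array([[1.0, 2.0],
                [3.0, 4.0]])
# Pad by 2 pixels on each side with copies of the edge pixels, so the
# convolution never sees artificial zeros at the image border.
padded = np.pad(img, 2, mode='edge')
```

After convolving, the result is cropped back to the original image size, as described above.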
With the edge detection algorithm, the time complexity can be as low as O(n^2) for an n×n input image. Computing the intensity and direction of the gradient, performing non-maximum suppression, and performing hysteresis each require visiting every pixel once: O(n^2). Non-maximum suppression tests two neighboring pixels at each step, but that is constant overhead. Likewise, hysteresis is O(n^2), since once a path is traversed from a given pixel, those pixels are not visited again. The more complicated portion of the algorithm is the convolution step. Each part of it (convolving with the x-gradient and the y-gradient of the Gaussian) requires O(n^2 * m^2), where m is the size of the Gaussian matrix. Since m is chosen by the user and is typically small relative to n, the entire algorithm behaves like O(n^2); as m approaches n, however, convolution dominates, giving a time complexity of O(n^4).
The user-definable variables are interconnected in the Canny edge detection algorithm. First, the gradient matrix size must be set large enough to contain the entire gradient of the Gaussian produced by sigma; more specifically, if the plain Gaussian(x,y) were calculated at each point, the sum over the entire matrix should be approximately 1. Second, sigma and the thresholds are related. As sigma increases, the thresholds must be lowered, since the gradients are smoothed out, lowering their overall maximum values. The low threshold T_l of the hysteresis step (when low) mattered much less than the high threshold T_h, since the high threshold already discards chains without significantly intense points. As T_l increases toward T_h, however, the algorithm starts ignoring all but the brightest points. Likewise, with a lower T_h, more and more chains are included.
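The "matrix large enough" rule can be checked numerically: sample the plain Gaussian on the chosen grid and verify the sum is close to 1. This is a sketch; `gaussian_sum` is a hypothetical helper, not part of the original code.

```python
import numpy as np

def gaussian_sum(sigma, size):
    """Sum of a size x size sampled Gaussian; close to 1 only when the
    matrix is large enough to contain the whole Gaussian."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]   # coordinates centered on the origin
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return g.sum()
```

A matrix much smaller than a few sigmas on each side loses a large fraction of the Gaussian's mass, which is exactly the failure mode the rule guards against.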
As an example, consider the building.jpg image above. The image shown resulted from sigma set to 1, T_h to 3, and T_l to 1. These values were chosen because the resulting image outlines almost every brick on the building. As sigma increases (see above), the edges around the bricks blur and are no longer found; instead, only the more distinct edges are detected, such as those around the windows and between the building and the sky. A similar result is obtained with sigma = 1 by increasing T_h (for example, to 40), which excludes the low-intensity gradients along the bricks.
An implementation of the Corner Detector. My detector computes the gradient of the Gaussian (using gaussgradient() below), pads the image with replications of the edge pixels, convolves the input image with the gradient of the Gaussian, and crops the image and gradients back to the original image size. It then computes the intensity and direction of the gradient at each point.
Once the intensity and the directions have been calculated, the covariance matrix and its minimum eigenvalue are computed for each window (of size WINDOW_SIZE). If that minimum eigenvalue is greater than or equal to the THRESHOLD, the point the window is centered on, along with the minimum eigenvalue, is added to a list of possible corners. That list is then sorted in descending order by eigenvalue. For each value in the list, from top to bottom, any point below it that falls within the current point's window has its eigenvalue zeroed out. This eliminates overlapping windows and leaves windows centered on the largest minimum eigenvalues.
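The per-window score can be sketched as the closed-form minimum eigenvalue of the 2×2 covariance matrix built from the window's gradients (a NumPy sketch, not the original code; gx and gy are assumed to be the gradient components over one window):

```python
import numpy as np

def min_eigenvalue(gx, gy):
    """Smallest eigenvalue of the 2x2 covariance matrix
    [[sum gx^2, sum gx*gy], [sum gx*gy, sum gy^2]] over a window.
    Large values indicate strong gradients in two directions, i.e. a corner."""
    a = float(np.sum(gx * gx))
    b = float(np.sum(gx * gy))
    c = float(np.sum(gy * gy))
    half_trace = (a + c) / 2.0
    disc = np.sqrt(max(half_trace**2 - (a * c - b * b), 0.0))
    return half_trace - disc
```

On a straight edge the gradients all point one way, so the minimum eigenvalue is near zero; only when the window contains gradients in two directions does the score exceed the threshold.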
After computing all corners from the list, the algorithm then adds boxes to the output image around those corners to highlight them on the image.
Function that returns the gradient of the Gaussian function. Given an x and y coordinate and a sigma value, returns the partial derivative of the Gaussian with respect to x and with respect to y. Note: the x and y coordinates are centered around the origin.
My corner detector has a few user-selectable options:
Gradient of the smoothed image in the x direction
Gradient of the smoothed image in the y direction
Intensity of the gradient
Direction of the gradient vector at each point
Output of corner detector, with corners boxed in: sigma = 2, gradient matrix size = 32px, threshold = 1000, window size = 6px
Output of corner detector, with corners boxed in: sigma = 1, gradient matrix size = 16px, threshold = 500, window size = 6px
Output of corner detector, with corners boxed in: sigma = 2, gradient matrix size = 32px, threshold = 700, window size = 6px
Output of corner detector, with corners boxed in: sigma = 2, gradient matrix size = 32px, threshold = 700, window size = 6px
Output of corner detector, with corners boxed in: sigma = 2, gradient matrix size = 32px, threshold = 700, window size = 12px
The corner detection algorithm, while appearing simpler, runs much longer. Here, the choices of sigma and threshold greatly affect both the running time and the number of corners found. For example, in the first image of the building above, the corner detector was run with sigma = 1 and a threshold of 500. It found many corners, including many of the bricks, but also found "erroneous" corners in the clouds. Simply increasing sigma to 2 and raising the threshold slightly to 700 eliminated a great number of those points. With these values, only the most distinct corners of the building are detected, even though some clouds still exhibit the signs of a corner.
On running time: as the threshold decreases, more and more windows are added to the list of possible corners, and their eigenvalues have to be sorted and compared later. With a higher threshold, fewer windows are added, and therefore less time is spent sorting and comparing eigenvalues.
Since I wanted close, small corner selections, I chose small windows over which to compute the covariance matrix. For each of the examples above except Bath, the window size was only 6 pixels. This was key for the bigtest.jpg test image, since a larger window would have led to overlapping corners, leaving some undetected. Bath was given a 12px window to add some variance to the results.