Feature Detection, Part 3: Harris Corner Detection

Feature detection is a domain of computer vision that focuses on using tools to detect regions of interest in images. A significant aspect of most feature detection algorithms is that they do not employ machine learning under the hood, making the results more interpretable and even faster in some cases.

In the previous two articles of this series, we looked at the most popular operators for detecting image edges: Sobel, Scharr, and Laplacian, along with the Gaussian used for image smoothing. In one form or another, these operators rely on image derivatives and gradients under the hood, represented by convolutional kernels.

Alongside edges, image analysis often explores another type of local region: corners. A corner usually marks a change in the direction of an object's border, or the end of one object and the beginning of another. Corners occur more rarely than edges, and for that reason they provide more valuable information.

Example

Imagine you are assembling a 2D puzzle. What most people do at the beginning is find a piece whose image contains the border (edge) of an object. Why? Because this way, it is easier to identify adjacent pieces, since the number of pieces sharing a similar object edge is minimal.

We can go even further and focus on picking not edges but corners — a region where an object changes its edge direction. These pieces are even rarer than just edges and allow for an even easier search for other adjacent fragments because of their unique form.

For example, in the puzzle below, there are 6 edges (B2, B3, B4, D2, D3, and D4) and only 1 corner (C5). By picking the corner from the start, it becomes easier to localize its position because it is rarer than edges.

The goal of this article is to understand how corners can be detected. To do that, we will go through the details of the Harris corner detection algorithm, one of the simplest and most popular methods, developed in 1988.

Idea

Let us take three types of regions: flat, edge, and corner. We have already shown the structure of these regions above. Our objective will be to understand the distribution of gradients across these three cases.

During our analysis, we will also build an ellipse that contains the majority of the plotted points. As we will see, its form will provide strong indications of the type of region we are dealing with.

Flat region

A flat region is the simplest case. Usually, the entire image region has nearly the same intensity values, making the gradient values across the X and Y axes minor and centered around 0.

By taking the gradient points (Gₓ, Gᵧ) from the flat image example above, we can plot their distribution, which looks like below:

We can now construct an ellipse around the plotted points having a center at (0, 0). Then we can identify its two principal axes:

The major axis along which the ellipse is maximally stretched.
The minor axis along which the ellipse attains its minimum extent.

In the case of the flat region, it might be difficult to visually differentiate between the major and minor axes, as the ellipse tends to have a circular shape, as in our situation.

Nevertheless, for each of the two principal axes, we can then calculate the ellipse radii λ₁ and λ₂. As shown in the picture above, they are almost equal and have relatively small values.

Edge region

For the edge region, the intensity changes only in the edge zone. Outside of the edge, the intensity stays nearly the same. Given that, most of the gradient points are still centered around (0, 0).

However, for a small part around the edge zone, gradient values can drastically change in both directions. From the image example above, the edge is diagonal, and we can see changes in both directions. Thus, the gradient distribution is skewed in the diagonal direction as shown below:

For edge regions, the plotted ellipse is typically stretched in one direction and has very different radii λ₁ and λ₂.

Corner region

For corners, most of the intensity values outside the corners stay the same; thus, the distribution for the majority of the points is still located near the center (0, 0).

If we look at the corner structure, we can roughly think of it as an intersection of two edges with two different directions. As discussed in the previous section, the gradient distribution of a single edge stretches along one direction: X, Y, or a diagonal.

With two edges forming the corner, we end up with two different groups of points extending in two different directions from the center. An example is shown below.

Finally, if we construct an ellipse around that distribution, we will notice that it is larger than in the flat and edge cases. We can differentiate this result by measuring λ₁ and λ₂, which in this scenario will take much larger values.
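The three cases can be reproduced numerically. The sketch below is not taken from the article: it assumes that λ₁ and λ₂ are obtained as the eigenvalues of the 2×2 matrix of summed gradient products (the usual formulation behind the Harris detector), and it uses `np.gradient` instead of a Sobel kernel for simplicity. The function name `gradient_eigenvalues` and the synthetic 9×9 patches are illustrative only.

```python
import numpy as np

def gradient_eigenvalues(patch):
    """Return (lam_1, lam_2), largest first, for a grayscale patch."""
    patch = np.asarray(patch, dtype=np.float64)
    gy, gx = np.gradient(patch)  # derivatives along rows (Y) and columns (X)
    # Drop the border, where np.gradient falls back to one-sided differences.
    gx, gy = gx[1:-1, 1:-1], gy[1:-1, 1:-1]
    # 2x2 matrix of summed gradient products over the patch.
    m = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    lam_small, lam_large = np.linalg.eigvalsh(m)  # eigvalsh sorts ascending
    return lam_large, lam_small

size = 9
rows, cols = np.indices((size, size))

flat = np.full((size, size), 0.5)                 # constant intensity
edge = (rows + cols > size - 1).astype(float)     # diagonal edge
corner = np.zeros((size, size))
corner[4:, 4:] = 1.0                              # bright square with one corner

for name, patch in [("flat", flat), ("edge", edge), ("corner", corner)]:
    lam_1, lam_2 = gradient_eigenvalues(patch)
    print(f"{name:>6}: lam_1 = {lam_1:.3f}, lam_2 = {lam_2:.3f}")
```

On these toy patches the flat region gives two values close to zero, the edge gives one large and one near-zero value, and the corner gives two values that are both clearly above zero.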

Visualization

We have just seen three scenarios in which λ took different values. To better visualize the results, we can construct the diagram below:

Diagram showing the relationship between values of λ and region types.

Formula

To classify a region into one of the three zones, the formula below is commonly used to estimate the R coefficient:

R = λ₁ ⋅ λ₂ – k ⋅ (λ₁ + λ₂)² , where 0.04 ≤ k ≤ 0.06

Based on the R value, we can classify the image region:

R < 0 – edge region
R ~ 0 – flat region
R > 0 – corner region
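As a small illustration of the formula, the sketch below computes R from a pair of λ values and maps it to a region type, treating a clearly negative R as an edge, as in the standard Harris interpretation. The helper names and the tolerance `eps` (what counts as "R ~ 0") are assumptions made for this example; in practice the tolerance depends on the scale of the image gradients.

```python
def harris_score(lam_1, lam_2, k=0.04):
    """R = lam_1 * lam_2 - k * (lam_1 + lam_2)^2."""
    return lam_1 * lam_2 - k * (lam_1 + lam_2) ** 2

def classify_region(lam_1, lam_2, k=0.04, eps=1e-3):
    """Map a pair of lambda values to 'flat', 'edge' or 'corner'.

    eps is a hypothetical tolerance for "R ~ 0", not a value from the article.
    """
    r = harris_score(lam_1, lam_2, k)
    if r > eps:
        return "corner"
    if r < -eps:
        return "edge"
    return "flat"

# Illustrative lambda pairs: both small, one dominant, both large.
for lam_1, lam_2 in [(0.001, 0.001), (6.5, 0.0), (2.25, 1.75)]:
    print(lam_1, lam_2, classify_region(lam_1, lam_2))
```

When one λ dominates, the product λ₁ ⋅ λ₂ is small while the squared sum is large, so R becomes negative. When both are large, the product outweighs the penalty term and R is positive.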

OpenCV

Harris corner detection can be easily implemented in OpenCV using the cv2.cornerHarris function. Let’s see in the example below how it can be done.

Here is the input image with which we will be working:

First, let us import the necessary libraries.

import numpy as np
import cv2
import matplotlib.pyplot as plt

We are going to convert the input image to grayscale format, as the Harris detector works with pixel intensities. It is also necessary to convert the image format to float32, as computed values associated with pixels can exceed the bounds [0, 255].

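path = 'data/input/shapes.png'
image = cv2.imread(path)  # OpenCV loads images in BGR channel order
if image is None:  # cv2.imread returns None when the file cannot be read
    raise FileNotFoundError(f'Could not read image: {path}')
grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
grayscale_image = np.float32(grayscale_image)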
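To see why a floating-point type helps once computed values leave the [0, 255] range, here is a minimal NumPy sketch. The pixel values are made up for illustration, and the multiplication simply stands in for the kind of products that appear when gradients are combined.

```python
import numpy as np

pixels = np.array([200, 10], dtype=np.uint8)      # illustrative intensities

wrapped = pixels * pixels                         # uint8 arithmetic wraps modulo 256
exact = np.float32(pixels) * np.float32(pixels)   # float32 keeps the true product

print(wrapped.dtype, wrapped)
print(exact.dtype, exact)
```

The uint8 result silently wraps around for the first value, while the float32 result keeps the full product.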

Now we can apply the Harris filter. The cv2.cornerHarris function takes four parameters:

grayscale_image – input grayscale image in the float32 format.
blockSize (= 2) – defines the dimensions of the pixel block in the neighborhood of the target pixel considered for corner detection.
ksize (= 3) – the dimension of the Sobel filter used to calculate derivatives.
k (= 0.04) – coefficient in the formula used to compute the value of R.

R = cv2.cornerHarris(grayscale_image, 2, 3, 0.04)
R = cv2.dilate(R, None)

The cv2.cornerHarris function returns a matrix with the same dimensions as the original grayscale image. Its values can be well outside the normal range [0, 255]. For every pixel, that matrix contains the R coefficient value we looked at above.

cv2.dilate is a morphological operator that can optionally be applied immediately afterwards to visually group nearby corner responses.
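To get a feel for what dilation does to the response map, the following NumPy-only sketch replaces every value with the maximum of its 3×3 neighborhood. It is an approximation for illustration, assuming that a kernel of None corresponds to a 3×3 rectangular structuring element and that border pixels ignore values outside the image. The helper name `dilate_3x3` is hypothetical, and `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dilate_3x3(values):
    """Replace each element with the maximum of its 3x3 neighborhood."""
    values = np.asarray(values, dtype=np.float32)
    # Pad with -inf so that positions outside the array never win the maximum.
    padded = np.pad(values, 1, mode="constant", constant_values=-np.inf)
    return sliding_window_view(padded, (3, 3)).max(axis=(-2, -1))

R_demo = np.zeros((5, 5), dtype=np.float32)
R_demo[2, 2] = 9.0                 # a single strong response
dilated = dilate_3x3(R_demo)
print(dilated)
```

A single strong response spreads into a 3×3 block, which is why corners appear as small blobs rather than single pixels after this step.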

A common technique is to define a threshold above which pixels are considered corners. For instance, we can treat as corners all image pixels whose R value is greater than 1% of the global maximum of R. In our example, we color such pixels red, which is (0, 0, 255) in OpenCV's BGR channel order.

To display the image with matplotlib, we need to convert it from OpenCV's BGR channel order to RGB.

image[R > 0.01 * R.max()] = [0, 0, 255]
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
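The same kind of mask can also be turned into a list of corner coordinates, which is often more convenient than a colored image. The sketch below uses a small hand-made response matrix instead of a real cv2.cornerHarris output, so the values and the name `R_demo` are purely illustrative.

```python
import numpy as np

# Hand-made stand-in for a Harris response map (not real detector output).
R_demo = np.array([[0.0,  -5.0,  0.2],
                   [0.5, 100.0,  0.9],
                   [1.5,  40.0, -2.0]], dtype=np.float32)

threshold = 0.01 * R_demo.max()            # 1% of the strongest response
corner_mask = R_demo > threshold           # boolean mask, same shape as R_demo
corner_coords = np.argwhere(corner_mask)   # (row, column) pairs of detected corners

print(corner_coords)
```

Because the threshold is relative to the global maximum, a single very strong corner can suppress weaker ones, so the 1% factor is worth tuning per image.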

Finally, we use matplotlib to display the output image.

plt.figure(figsize=(10, 8))
plt.imshow(image_rgb)
plt.title('Harris Corner Detection')
plt.axis('off')
plt.tight_layout()
plt.show()

Here is the result:

Output image. Red color indicates corners.

Conclusion

In this article, we have examined a robust method for determining whether an image region is a corner. The presented formula for estimating the R coefficient works well in the vast majority of cases.

In real life, there is a common need to run a corner classifier over an entire image. Constructing an ellipse around the gradient points and estimating the R coefficient for every region is resource-intensive, so more advanced optimization techniques are used to speed up the process. Nevertheless, they rely heavily on the intuition we built here.
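One well-known shortcut, sketched here under simplifying assumptions, avoids computing λ₁ and λ₂ explicitly: for a 2×2 matrix M of summed gradient products, λ₁ ⋅ λ₂ equals det(M) and λ₁ + λ₂ equals trace(M), so R can be computed for every pixel with plain array arithmetic. The function below is an illustrative NumPy version, not OpenCV's implementation. It uses `np.gradient` rather than Sobel kernels, an unweighted square window, and zero padding, so its numbers will differ from those of cv2.cornerHarris.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def harris_response(gray, window=3, k=0.04):
    """Per-pixel R = det(M) - k * trace(M)^2 for an odd window size."""
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)  # derivatives along rows (Y) and columns (X)

    def window_sum(a):
        # Sum over a window x window neighborhood, zero-padded at the borders.
        padded = np.pad(a, window // 2, mode="constant")
        return sliding_window_view(padded, (window, window)).sum(axis=(-2, -1))

    sxx = window_sum(gx * gx)
    syy = window_sum(gy * gy)
    sxy = window_sum(gx * gy)

    det = sxx * syy - sxy ** 2   # equals lam_1 * lam_2
    trace = sxx + syy            # equals lam_1 + lam_2
    return det - k * trace ** 2

# Synthetic test image: a bright square whose top-left corner is at (8, 8).
image_demo = np.zeros((20, 20))
image_demo[8:, 8:] = 1.0

R_demo = harris_response(image_demo)
peak = np.unravel_index(np.argmax(R_demo), R_demo.shape)
print("strongest response at", peak)
```

On this synthetic image, the response is positive around the corner of the square, negative along its straight edges, and zero in the flat background.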