An Image-Based Approach for Vehicle Detection


Detecting vehicles in a still image is an important problem in image processing applications. Many methods have been developed; however, each has some disadvantage. This work presents an approach for detecting vehicles in static images that relies on the wheels of the vehicles. Every vehicle has wheels; the wheel detector finds them and infers the vehicle location from the wheel detections. Views from an omnidirectional camera are used to generate side-view images. Edges in these images are computed using the well-known Canny edge detector. After the edges are found, the Hough circle detection algorithm is used to find circular shapes in the images. Wheel candidates are chosen and tracked. Any wheel candidate tracked in the foreground is accepted as a wheel. Initial experimental results along with analysis are included.


The main objective of this work is to detect vehicles in still images by making use of the wheels of the vehicle.

The goal of this work is to improve the detection technique for side views of vehicles in still images.

The main contributions of this work are:

  • To modify existing vehicle detection methods to achieve better-quality vehicle detection output.
  • To apply an edge detection algorithm and the Hough circle detection algorithm to side-view still images of vehicles.

Originality of this work

The originality of this work is the modification of previously proposed vehicle detection methodologies. The improved output is achieved by using the Canny edge detection algorithm and the Hough circle detection algorithm.

Literature Review

Vehicle detection is an important problem in many related applications, such as self-guided vehicles, driver assistance systems, intelligent parking systems, and the measurement of traffic parameters like vehicle count, speed, and flow. The most common approach to vehicle detection is to use active sensors such as lasers or millimeter-wave radars. Prototype vehicles employing active sensors have shown promising results; however, active sensors have several drawbacks: they have low resolution, may interfere with each other, and are rather expensive. Passive sensors such as cameras, on the other hand, offer a more affordable solution and can track cars entering a curve or moving from one side of the road to another more effectively. Moreover, visual information can be very important in a number of related applications such as lane detection, traffic sign recognition, and object identification (e.g., pedestrians, obstacles).

One of the most common approaches to vehicle detection is to use vision-based [10] techniques to analyze vehicles in images or videos. However, due to the variations in vehicle colors, sizes, orientations, shapes, and poses, developing a robust and effective vision-based vehicle detection system is very challenging. To address these problems, different approaches using different features and learning algorithms [2-7] for locating vehicles have been investigated. For example, many techniques use background subtraction [17] to extract motion features for detecting moving vehicles in video sequences. However, this kind of motion feature is not available in still images.

Previous works have also attempted to use PCA for front-view classification of vehicles. Z. Sun et al. use support vector machines [7] after applying a Gabor filter bank, together with template matching on the rear view of vehicles. M. Bertozzi et al. use stereo correspondence matching to find vehicles. None of these use a side view, and most do not single out one feature to detect.

As mentioned before, vehicles show large appearance variations in color, size, and shape, which change with viewing position, lighting conditions, and cluttered backgrounds. These variations make it difficult to select a general feature to describe vehicles. In this work, a novel wheel-based method for detecting vehicles in still images is proposed. The goal is to use a specific feature that is present on all cars: wheels. All car wheels are round and have similar texture. They are also framed similarly, with fenders on top and the roadbed below. The goal is to take advantage of this information to form a robust wheel detector that can be used as part of a vehicle detector. One application of this work is object avoidance: the algorithm can detect a wheel in the blind spot, or determine whether a wheel is getting too close.

Organization of this thesis

Chapter one describes the background and objectives of this work, its originality, and a literature review of vehicle detection in still images.

Chapter two covers some image processing basics (such as the histogram, color models, and contrast enhancement).

Chapter three describes the methodologies used in this work. The new method is also proposed in this chapter.

Experimental results and discussion are presented in chapter four.

Finally, chapter five draws the conclusion.

Digital Imaging

Digital imaging or digital image acquisition is the creation of digital images, typically from a physical scene. The term is often assumed to imply or include the processing, compression, storage, printing, and display of such images. The most usual method is by digital photography with a digital camera but other methods are also employed.

Digital Imaging Methods

A digital photograph may be created directly from a physical scene by a camera or similar device. Alternatively, a digital image may be obtained from another image in an analog medium, such as photographs, photographic film, or printed paper, by an image scanner or similar device. Many technical images—such as those acquired with tomographic equipment, side-scan sonar, or radio telescopes—are actually obtained by complex processing of non-image data. Weather radar maps as seen on television news are a commonplace example. The digitalization of analog real-world data is known as digitizing, and involves sampling (discretization) and quantization. Finally, a digital image can also be computed from a geometric model or mathematical formula. In this case the name image synthesis is more appropriate, and it is more often known as rendering.

Image representation

Represent an image as a 2D array. Indices represent the spatial location. Values represent light intensity.

Digital Image Processing

Any 2D mathematical function that bears information can be represented as an image. A digital image is an array of real or complex numbers represented by a finite number of bits. Digital image processing generally refers to processing of a 2D picture by a digital computer.

Digital image processing focuses on two major tasks:

  • Improvement of pictorial information for human interpretation
  • Processing of image data for storage, transmission and representation for autonomous machine perception


Image Representation and modeling

The goal of image modeling or representation is to find proper ways to mathematically describe and analyze images. It is therefore the most fundamental step in image processing. An image could represent the luminance of objects in a scene (image taken by a camera), the absorption characteristics of body tissue or material particles (X-ray imaging), the radar cross-section of a target (radar imaging), the temperature profile of a region (infrared imaging), or the gravitational field in an area (geophysical imaging).

Image Model

An image can be represented by a matrix U whose elements $u_{i,j}$, for $0 \le i \le N-1$ (row) and $0 \le j \le M-1$ (column), are called picture elements or pixels.

The image resolution is the size M × N (width × height) in pixels. The spatial resolution is the real-world size covered by one pixel.

Image Coordinate System

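The image coordinate system is conventionally defined with the origin at the top-left corner of the image, with the column index increasing to the right and the row index increasing downward.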

Types of Digital Image

The image types we will consider are: 1) binary, 2) gray-scale, 3) color, and 4) multispectral.

  • Binary Images

                 Binary images are the simplest type of images and can take on two values, typically black and white, or 0 and 1. A binary image is referred to as a 1-bit image because it takes only 1 binary digit to represent each pixel. These types of images are frequently used in applications where the only information required is general shape or outline, for example optical character recognition (OCR).

                 Binary images are often created from the gray-scale images via a threshold operation, where every pixel above the threshold value is turned white (‘1’), and those below it are turned black (‘0’).

  • Gray-scale images

                 Gray-scale images are referred to as monochrome (one-color) images.

They contain gray-level information but no color information. The number of bits per pixel determines the number of different gray levels available. A typical gray-scale image contains 8 bits/pixel, which allows 256 different gray levels. In applications such as medical imaging and astronomy, 12 or 16 bits/pixel images are used. These extra gray levels become useful when a small section of the image is magnified to discern details.
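The threshold operation that turns a gray-scale image into a binary image can be sketched as follows. This is only an illustration: the function name `to_binary`, the sample pixel values, and the choice to map a pixel exactly equal to the threshold to 0 are assumptions, not part of the method described in this work.

```python
def to_binary(gray, threshold):
    """Map a gray-scale image (a list of rows) to a binary image of 0s and 1s.

    Pixels strictly above the threshold become 1 (white); all others
    become 0 (black). Treating "equal to threshold" as black is an assumption.
    """
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


# Hypothetical 2 x 3 gray-scale image with 8-bit values
gray = [
    [12, 200, 90],
    [130, 40, 255],
]
binary = to_binary(gray, threshold=128)
```

The image is stored as a 2D array in which the indices give the spatial location and the values give the light intensity, matching the representation described earlier.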

The left image appears washed-out (most of the intensities are in a narrow band due to poor contrast).  The right image maps those values to the full available dynamic range.

Image Histogram

  • The histogram of an image is a table containing (for every gray level K) the probability of level K actually occurring in the image
  • The histogram could also be viewed as a frequency distribution of gray level within the image.
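A normalized histogram of this kind can be computed in a few lines. The sketch below assumes a non-empty gray-scale image stored as a list of rows with integer gray levels; the function name `histogram` and the tiny 4-level test image are illustrative assumptions.

```python
from collections import Counter


def histogram(gray, levels=256):
    """Return a list h where h[k] is the fraction of pixels with gray level k."""
    counts = Counter(pixel for row in gray for pixel in row)
    total = sum(counts.values())  # assumes the image is not empty
    return [counts.get(k, 0) / total for k in range(levels)]


# Hypothetical image with only 4 gray levels (0..3)
gray = [[0, 0, 1], [3, 3, 3]]
hist = histogram(gray, levels=4)
```

Dividing the counts by the total number of pixels gives the probability view; leaving out the division gives the frequency-distribution view.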

Color and Color Model

Color is the visual perceptual property corresponding in humans to the categories called red, green, blue and others. Color derives from the spectrum of light interacting in the eye with the spectral sensitivities of the light receptors. A color model is an abstract mathematical model describing the way colors can be represented as tuples of numbers, typically as three or four values or color components.

Color representation

In traditional (pigment-based) color theory, the three primary colors are red, yellow and blue. The secondary colors are green, orange and purple. Additionally, there are tertiary colors that are combinations of the first two sets.

Colors and Electromagnetic Spectrum

Wavelength of visible light: approximately 350-780 nm

RED = 700 nm, GREEN = 546.1 nm and BLUE = 435.8 nm

Three Colors Theory

Thomas Young (1802) stated that any color can be reproduced by mixing an appropriate set of three primary colors. Light sources use additive color models; light absorption uses subtractive color models. The human visual system (HVS) uses three kinds of cones, with response peaks in the yellow-green, green, and blue regions and significant overlap. The human eye cannot resolve the components of a color mixture; therefore monochromatic colors are not unique for the HVS. The HVS is sensitive to dozens of grey levels and thousands of colors.

Color Models

  • Color models attempt to mathematically describe the way that humans perceive color
  • The human eye combines 3 primary colors (using the 3 different types of cones) to discern all possible colors.
  • Colors are just different light frequencies
    • red – 700nm wavelength
    • green – 546.1 nm wavelength
    • blue – 435.8 nm wavelength
  • Lower frequencies (longer wavelengths, toward red) are perceived as warmer colors; higher frequencies (toward blue) as cooler colors

Primary Colors

  • Primary colors of light are additive
  • Primary colors are red, green, and blue
  • Combining red + green + blue yields white
  • Primary colors of pigment are subtractive
  • Primary colors are cyan, magenta, and yellow
  • Combining cyan + magenta + yellow yields black

RGB color model

The RGB color model is an additive color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors: red, green, and blue. The main purpose of the RGB color model is the sensing, representation, and display of images in electronic systems, such as televisions and computers, though it has also been used in conventional photography. Before the electronic age, the RGB color model already had a solid theory behind it, based on human perception of colors.

CMY color model

It is possible to achieve a large range of colors seen by humans by combining cyan, magenta, and yellow transparent dyes/inks on a white substrate. These are the subtractive primary colors. Often a fourth black is added to improve reproduction of some dark colors. This is called “CMY” or “CMYK” color space.

The cyan ink absorbs red light but transmits green and blue, the magenta ink absorbs green light but transmits red and blue, and the yellow ink absorbs blue light but transmits red and green. The white substrate reflects the transmitted light back to the viewer. Because in practice the CMY inks suitable for printing also reflect a little bit of color, making a deep and neutral black impossible, the K (black ink) component, usually printed last, is needed to compensate for their deficiencies. The dyes used in traditional color photographic prints and slides are much more perfectly transparent, so a K component is normally not needed or used in those media.
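The complementary relation between the additive and subtractive primaries can be illustrated with an idealized conversion. The sketch below assumes normalized channel values in [0, 1] and ideal inks; the function names are hypothetical, and real printing workflows use device-specific color profiles rather than this simple formula.

```python
def rgb_to_cmy(r, g, b):
    """Idealized conversion: each ink amount is the complement of the light it absorbs."""
    return 1.0 - r, 1.0 - g, 1.0 - b


def cmy_to_cmyk(c, m, y):
    """Move the common gray component of C, M and Y into a separate black (K) channel."""
    k = min(c, m, y)
    if k == 1.0:  # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1.0 - k), (m - k) / (1.0 - k), (y - k) / (1.0 - k), k


red_cmy = rgb_to_cmy(1.0, 0.0, 0.0)        # red = no cyan, full magenta and yellow
black_cmyk = cmy_to_cmyk(1.0, 1.0, 1.0)    # black is carried by K alone
```

Pure red needs no cyan ink because cyan is the ink that absorbs red light, which agrees with the absorption behaviour described above.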

YIQ Color Model

  • Luminance (Y), In phase (I), and Quadrature (Q)
  • Used for TV broadcasts – backward compatible with monochrome TV standards
  • Luminance is the black-and-white (BW) component
  • Human visual system is more sensitive to changes in intensity than in color.
  • In NTSC, bandwidth allocation of YIQ is 4MHz, 1.5 MHz, and 0.6 MHz respectively.
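The separation of luminance from the two chrominance components can be tried out with the `colorsys` module of the Python standard library, which provides `rgb_to_yiq`. The sample colors below are illustrative assumptions; inputs are normalized to [0, 1].

```python
import colorsys

# A neutral gray carries no chrominance: I and Q are (numerically) zero
gray_y, gray_i, gray_q = colorsys.rgb_to_yiq(0.5, 0.5, 0.5)

# Pure red: its luminance is only the red weight of the Y formula
red_y, red_i, red_q = colorsys.rgb_to_yiq(1.0, 0.0, 0.0)
```

Because a gray pixel maps to Y alone, a monochrome receiver can display the Y signal and ignore I and Q, which is the backward compatibility mentioned in the list above.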

HSI Color Model

Based on human perception of colours. Colour is “decoupled” from intensity.

  • Hue
    • A subjective measure of colour.
    • The average human eye can perceive ~200 different colours.
  • Saturation
    • Relative purity of the colour. Mixing more “white” with a colour reduces its saturation.
    • Pink has the same hue as red but less saturation.
  • Intensity
    • The brightness or darkness of an object.

In color image processing, RGB images are often converted to HSI and then the I component is manipulated.  The image is then converted back to RGB.
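A minimal sketch of the RGB to HSI conversion is given below, using the commonly quoted geometric formulas (intensity as the channel mean, saturation from the minimum channel, hue from an arccosine). The function name `rgb_to_hsi` and the convention of reporting hue 0 for gray pixels are assumptions; inputs are normalized to [0, 1].

```python
import math


def rgb_to_hsi(r, g, b):
    """Return (hue in degrees, saturation, intensity) for r, g, b in [0, 1]."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, report 0
    saturation = 1.0 - min(r, g, b) / intensity
    denom = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denom == 0.0:
        return 0.0, saturation, intensity  # gray: hue is undefined, report 0
    cos_theta = 0.5 * ((r - g) + (r - b)) / denom
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    theta = math.degrees(math.acos(cos_theta))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity


red_h, red_s, red_i = rgb_to_hsi(1.0, 0.0, 0.0)
pink_h, pink_s, pink_i = rgb_to_hsi(1.0, 0.5, 0.5)
```

With these inputs, red and pink come out with the same hue while pink has a lower saturation, which is the relationship stated in the list above.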


Contrast is the difference in luminance and/or color that makes an object (or its representation in an image or display) distinguishable. In visual perception of the real world, contrast is determined by the difference in the color and brightness of the object and other objects within the same field of view. Because the human visual system is more sensitive to contrast than absolute luminance, we can perceive the world similarly regardless of the huge changes in illumination over the day or from place to place. The maximum contrast of an image is the contrast ratio or dynamic range.

Contrast is also the difference between the color or shading of the printed material on a document and the background on which it is printed, for example in optical character recognition.

Definitions of Image Contrast

There are many possible definitions of contrast. Some include color; others do not. Travnikova laments, “Such a multiplicity of notions of contrast is extremely inconvenient. It complicates the solution of many applied problems and makes it difficult to compare the results published by different authors.”

Various definitions of contrast are used in different situations. Here, luminance contrast is used as an example, but the formulas can also be applied to other physical quantities. In many cases, the definitions of contrast represent a ratio of the type

$$\frac{\text{luminance difference}}{\text{average luminance}}$$

The rationale behind this is that a small difference is negligible if the average luminance is high, while the same small difference matters if the average luminance is low (see Weber–Fechner law). Below, some common definitions are given.

The Weber contrast is defined as

$$C_W = \frac{I - I_b}{I_b},$$

with $I$ and $I_b$ representing the luminance of the features and the background luminance, respectively. It is commonly used in cases where small features are present on a large uniform background, i.e. the average luminance is approximately equal to the background luminance.

The Michelson contrast (also known as the visibility) is commonly used for patterns where both bright and dark features are equivalent and take up similar fractions of the area. The Michelson contrast is defined as

$$C_M = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}},$$

with $I_{\max}$ and $I_{\min}$ representing the highest and lowest luminance. The denominator represents twice the average of the luminance.
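Both contrast measures are simple ratios and can be computed directly. The sketch below uses the standard definitions (Weber: feature minus background, divided by background; Michelson: maximum minus minimum, divided by their sum); the function names and sample luminance values are illustrative assumptions.

```python
def weber_contrast(feature, background):
    """Weber contrast: (I - I_b) / I_b, for a small feature on a uniform background."""
    return (feature - background) / background


def michelson_contrast(l_max, l_min):
    """Michelson contrast: (I_max - I_min) / (I_max + I_min), for periodic patterns."""
    return (l_max - l_min) / (l_max + l_min)


weber = weber_contrast(feature=120.0, background=100.0)
michelson = michelson_contrast(l_max=200.0, l_min=50.0)
```

The same luminance difference of 20 would give a Weber contrast ten times smaller on a background of 1000, which is the point made above about differences mattering less at high average luminance.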

Category of Contrast Enhancement Techniques

Contrast enhancement techniques can be categorized as:

  1. Direct and indirect methods.
  2. Global (Global Contrast Enhancement) and local techniques.

Direct and Indirect Method

Direct methods define a contrast measure and try to improve it. Various definitions of contrast are used in different situations.

Indirect methods are based on histogram analysis, which is used for controlling image contrast and brightness. Histogram modification techniques fall into this category. They exploit the under-utilized regions of the dynamic range and do not define a specific contrast term. These techniques modify the image through some pixel mapping.

Global and Local Technique

The contrast can be enhanced either globally or locally.

In a global method (Global Contrast Enhancement), a single mapping derived from the whole image is used.

In a local method, the neighborhood of each pixel is used to obtain a local mapping function.
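As one simple instance of a global technique, linear contrast stretching derives a single mapping from the minimum and maximum gray levels of the whole image and applies it to every pixel. This is a sketch under that assumption only; it is not claimed to be the enhancement used in this work, and the function name and sample values are hypothetical.

```python
def stretch_contrast(gray, out_min=0, out_max=255):
    """Linearly map the occupied gray-level range onto [out_min, out_max]."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: there is nothing to stretch
        return [row[:] for row in gray]
    scale = (out_max - out_min) / (hi - lo)
    # The same mapping is applied to every pixel, which is what makes it global
    return [[round(out_min + (p - lo) * scale) for p in row] for row in gray]


# Hypothetical washed-out image: all values sit in the narrow band 100..150
washed_out = [[100, 110], [120, 150]]
stretched = stretch_contrast(washed_out)
```

A local method would instead recompute `lo` and `hi` (or a histogram) over a window around each pixel, giving a different mapping at each location.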

Basic Algorithm

The basic algorithm used for vehicle detection involves the following procedure:

First, we take still color images showing the side view of a vehicle. Some restrictions are needed for good results, such as a full side view of a single vehicle. The full side-view image is needed for an accurate and effective result. An example of a full side-view image is as follows:

The first step of this algorithm is to transform the color image to gray scale; that is, the image is converted from the RGB color space to a gray-scale image.
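The gray-scale conversion can be sketched as a weighted sum of the three channels. The weights 0.299, 0.587 and 0.114 are the widely used luminance weights and are an assumption here; the exact weights used in the implementation are not stated. The function name and the sample pixels are also illustrative.

```python
def rgb_to_gray(image):
    """Convert an image of (R, G, B) tuples (0..255) to gray levels (0..255).

    Uses the common luminance weights 0.299, 0.587, 0.114 (an assumption).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


# One row holding pure red, pure green, pure blue and white
image = [[(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]]
gray = rgb_to_gray(image)
```

Green contributes most to the gray level and blue least, which reflects the differing sensitivity of the eye to the three primaries.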

After that, an edge detection algorithm is applied to the gray image to find the edge map of the image. In this method the Canny edge detector is used to find the edges. The resulting image after applying edge detection to the gray image is as follows:

The next step is to find the circular objects in the image. The Hough circle detection algorithm is used for this purpose, because every vehicle has wheels on each side and we are using side-view images of vehicles. Several circles will be detected, depending on the environment and the vehicle. These circles are the candidate vehicle wheels, although other objects and shapes in the environment are also included. The resulting image after circle detection on the edge-detected image is as follows:

The next step is to find the vehicle wheels among the candidate circles. If there are only two circles, they are the possible wheels of the vehicle. If there are more than two, the other circles must be removed based on some criteria. To do this, we pick two circles at a time and calculate their radius difference, their horizontal alignment, and the distance between them. The radius difference of the two circles should be minimal, since both wheels are expected to have the same radius. A threshold is used for this difference: a pair of circles whose radius difference is greater than the threshold is rejected. The horizontal alignment of the circles is also measured: the difference between the Y-values of the two circles should be minimal. A threshold is used here as well, and a pair whose difference exceeds it is rejected. Finally, we calculate the distance between the circles, since a pair of wheels should not overlap and there is a minimum distance between them. The pair of circle candidates that satisfies the above criteria is selected as the most probable pair of wheels. The resulting image after finding the best pair of circles is as follows:
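The pair-selection criteria can be sketched as a filter over all pairs of candidate circles. Several details below are assumptions made only for illustration: circles are given as (x, y, radius) tuples, "not overlapping" is tested as the center distance exceeding the sum of the radii, and among the surviving pairs the one with the smallest combined radius and Y difference is returned. The default thresholds 3 and 5 are the values reported in the experiment section.

```python
import math
from itertools import combinations


def select_wheel_pair(circles, max_radius_diff=3, max_y_diff=5):
    """Return the most plausible wheel pair from (x, y, r) circles, or None."""
    best, best_score = None, None
    for a, b in combinations(circles, 2):
        (xa, ya, ra), (xb, yb, rb) = a, b
        radius_diff = abs(ra - rb)
        y_diff = abs(ya - yb)
        if radius_diff > max_radius_diff or y_diff > max_y_diff:
            continue  # wheels should have similar size and lie on one line
        if math.hypot(xa - xb, ya - yb) <= ra + rb:
            continue  # overlapping circles cannot be two separate wheels
        score = radius_diff + y_diff  # ranking rule is an assumption
        if best_score is None or score < best_score:
            best, best_score = (a, b), score
    return best


# Hypothetical candidates: two wheels plus two unrelated circles
candidates = [(50, 100, 20), (150, 102, 21), (100, 40, 8), (60, 95, 35)]
pair = select_wheel_pair(candidates)
```

Here only the first two circles pass all three tests, so they are returned as the wheel pair; with fewer than two usable candidates the function returns `None`.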

The flowchart of the vehicle detection algorithm is given below:


The above discussion specifies the vehicle detection algorithm. The algorithm is found to be highly accurate and efficient when a full side-view image (such as the input image) is available. It uses basic image processing and circle detection methods, and it is easy to implement.

Vehicle Detection Mechanism

In this algorithm, vehicle detection follows from wheel detection. Since vehicle wheels can be detected in a side-view image, the vehicle detection mechanism is highly dependent on circle detection. This section describes the edge detection and circle detection techniques on which the vehicle detection mechanism relies.

Edge Detection

This algorithm relies heavily on the edge-mapped image, in which the circular objects are found. The Canny edge detection algorithm is used for better results. The edge image is computed from the gray-scale version of the original input image. In some cases the gray-scale image needs to be histogram-equalized first to get a better result. A good edge map relies on sharp edges in the original image. The shadows and edges of vehicles help make vehicle detection successful.
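To give a feel for what an edge map is, the sketch below thresholds the Sobel gradient magnitude of a gray-scale image. This is deliberately simpler than the Canny detector used in this work: Canny additionally applies Gaussian smoothing, non-maximum suppression and hysteresis thresholding, none of which are included here. The function name, threshold and test image are assumptions.

```python
import math


def sobel_edges(gray, threshold):
    """Return a binary edge map: 1 where the Sobel gradient magnitude exceeds threshold."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):          # border pixels are left as 0
        for x in range(1, w - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if math.hypot(gx, gy) > threshold:
                edges[y][x] = 1
    return edges


# Synthetic image: dark left half, bright right half (one vertical edge)
gray = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_edges(gray, threshold=100)
```

The edge pixels appear only in the two columns next to the intensity step, which illustrates why sharp edges in the input give a cleaner edge map.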

Circle Detection

The algorithm is also based on circle detection in the input image. After obtaining the edge-mapped image, the Hough circle detection technique is applied to find the circles in the image. The wheels of a vehicle must be circular in shape, and because side-view images are used, this technique makes the algorithm highly accurate and efficient. However, the environment may contain other circular objects, which need to be eliminated. To eliminate them, constraints based on the wheel positions on a vehicle are applied to the detected circles. The result is the most probable pair of wheels in the side-view image.
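The voting idea behind the Hough circle transform can be sketched in plain Python: every edge point votes for all centers that would place it on a circle of each candidate radius, and the (center, radius) combination with the most votes is taken as a detected circle. This is a minimal illustration under stated assumptions (integer pixel grid, 72 angle steps, a small radius range, a synthetic set of edge points), not the implementation used in this work.

```python
import math
from collections import Counter


def hough_circles(edge_points, radii, angle_steps=72):
    """Accumulate votes for circles; returns a Counter keyed by (cx, cy, r)."""
    votes = Counter()
    for (x, y) in edge_points:
        for r in radii:
            candidates = set()
            for step in range(angle_steps):
                theta = 2 * math.pi * step / angle_steps
                cx = round(x - r * math.cos(theta))
                cy = round(y - r * math.sin(theta))
                candidates.add((cx, cy, r))
            votes.update(candidates)  # at most one vote per point per candidate
    return votes


# Synthetic edge points on a circle with center (20, 15) and radius 8
points = {
    (round(20 + 8 * math.cos(2 * math.pi * k / 36)),
     round(15 + 8 * math.sin(2 * math.pi * k / 36)))
    for k in range(36)
}
votes = hough_circles(points, radii=range(6, 11))
(best_cx, best_cy, best_r), best_votes = votes.most_common(1)[0]
```

The cost grows with the number of radii searched, since every edge point votes once per radius and angle step; this is consistent with the observation later in this work that a wide radius range increases calculation time.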

Experiment Result

The proposed method was implemented and tested in MATLAB R2012a on a computer with an Intel Core i3 2.4 GHz processor, 6 GB RAM, and Windows 7. The efficiency of the proposed method was evaluated by both visual and numerical inspection. The test images were:

  • 20 car images (various car images from side view)
  • 30 motorcycle images (various motorcycle images captured from side view)

Analysis methods

  • Visual inspection (by expert)
  • Numerical Inspection  (using the calculation time and success rate)

The 20 car and 30 motorcycle images were analyzed and evaluated by both visual and numerical inspection to assess the efficiency of the proposed method.

For visual inspection, the 20 car images and 30 motorcycle images were analyzed by an expert to find the most accurate result. The detected wheels are displayed over the original image. Results of the initial study are given in the following figures.


The average result of this experiment is given below.

The radius search range used was 10-100 for cars and 20-100 for motorcycles. The threshold for the radius difference was 3, and the threshold for the horizontal position (Y-value) difference was 5.

Vehicle type | Number of vehicles | Other circular objects detected | Number of detected vehicles | Detection percentage

Table 1: Vehicle detection result


The proposed vehicle detection algorithm can be applied only to side-view images of vehicles, as the wheels are exposed only in the side view. Although there may be many other circular objects in the environment, this method can successfully remove those unwanted objects. The success rate is quite impressive.


Vehicle detection is a complex and challenging task due to the complex nature of images. It is a preliminary step in the analysis performed by traffic monitoring and control systems. In this work, a preliminary step toward vehicle detection is introduced by detecting wheels in side-view images of vehicles. Color images are converted to gray-scale images, and then the edges of the images are calculated. By applying the Hough circle detection algorithm, car wheels are detected from the edge maps of vehicle images. The wheel candidates are chosen from the detected circles. The wheel candidates are then tracked. Initial findings show promising results; however, further work is required to evaluate the performance of the proposed vehicle detection method.

Achievement of this thesis

  • Better vehicle detection by detecting the wheels of vehicles in side-view images.
  • Wheels can be detected in a variety of conditions and on a variety of vehicle types. Some conditions are problematic, but a better non-road model will improve them.



Like any system, this one has limitations.

  1. The main limitation is that it can detect a vehicle only from the side view in a still image. The wheels must be exposed so that they can be detected, and they must appear circular in shape.
  2. The calculation time is somewhat high, so the method cannot be applied to real-time traffic monitoring. The high calculation time is caused by the range of wheel radii that must be searched.
  3. Different categories of vehicles cannot yet be identified. This is left for future work.

Future work

The methodology proposed in this thesis can be further developed for vehicle detection. Some promising directions and challenges for future work on this system are listed below.

  1. The proposed methodology can be extended to detect various kinds of vehicles.
  2. Detecting multiple vehicles in a single image will be a challenge, as the wheels of the different vehicles have to be told apart.
  3. Vehicles could be classified by differentiating the wheel features of various vehicles and building a detection mechanism on them.
