illustration. Our work addresses only the latter case
and therefore requires a query input. Brown
proposed a very similar system for retrieving the
colour of predefined, familiar objects. Her framework
accumulates a histogram of coloured pixels for a
small number of human-perceived colours. This
discretization is parameterized to determine the
dominant colour of the object.
Swain provided the initial idea of colour recognition
based on colour histograms, which are matched by
histogram intersection. Later modifications of this
idea improve the histogram measures and incorporate
information about the spatio-temporal relationships
of the colour pixels. Our method is also based on a
histogram binning technique, combined with temporal
accumulation of results over multiple frames to
improve the accuracy of true colour extraction.
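The histogram intersection match mentioned above can be sketched as follows. This is a minimal illustration of the commonly cited form of the measure (overlap of the two histograms, normalized by the total mass of the model histogram), not the implementation used in this work; the function name and the 4-bin example histograms are hypothetical.

```python
def histogram_intersection(image_hist, model_hist):
    """Overlap between two histograms, normalized by the model histogram mass."""
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must have the same number of bins")
    model_total = sum(model_hist)
    if model_total == 0:
        raise ValueError("model histogram is empty")
    # For every bin, only the count shared by both histograms contributes.
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / model_total


# Hypothetical 4-bin colour histograms (pixel counts per colour bin).
model = [10, 0, 5, 5]
image = [8, 2, 5, 1]
score = histogram_intersection(image, model)
```

A score close to 1 means the image histogram covers almost all of the model's colour mass; a score near 0 means the two share few colours.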
Wui et al. addressed the task of classifying the
colour of tracked objects into pre-specified colours.
Weijer et al. and Zhang et al. proposed
probabilistic latent semantic analysis (PLSA) and
Co-PLSA based approaches for object colour
categorization in videos. These methods rely on
complex features such as SIFT and MSER to divide
the objects into parts, e.g. tyres and windows
for vehicles, and separate them to reduce the effect
of their colour on the categorization of the vehicle's
main colour. These methods require extensive
processing, which makes them less suitable for
real-time applications.
Colour constancy is of prime importance for
accurate extraction of object colours in videos and
images. The most recent work on colour constancy
uses a Bayesian approach to solve for the
illumination conditions (Manduchi, 2006; Shaefer
et al., 2005; Tsin et al., 2001). Renno et al.
evaluated the advantages of two classic colour
constancy algorithms (grey world and gamut
mapping) for surveillance applications and found
that both algorithms improve colour, with gamut
mapping resulting in a smaller error than grey world.
However, we rely on computational colour
constancy and therefore use grey world (GW) because
of its simplicity and because it has the lowest
processing time among all CC techniques.
3 METHODOLOGY
The process of object colour recognition is carried
out in four major steps: colour correction, colour
space conversion from RGB to HSV, pixel
clustering and fine tuning.
3.1 Colour Correction
The colour correction of input video frames is
carried out in two major steps. A conventional colour
constancy technique is followed by a set of
processing procedures to recover the true colours of
the objects present in the video frames.
Colour constancy (CC): Colour constancy is
extremely important to reduce the effect of
illumination and surroundings. It is impossible for a
colour recognition system to perform well without
colour constancy. We took advantage of existing
techniques and tested computational algorithms on a
large number of CCTV videos. The Grey World, Max
RGB and Grey Edge algorithms were found to perform
approximately the same. We decided to use Grey
World because of its lower computational complexity,
which makes it suitable for a real-time application.
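To illustrate why Grey World is computationally cheap, the following is a minimal sketch of the textbook form of the algorithm: each channel is multiplied by a single gain so that its mean matches the average of the three channel means. It is not the authors' implementation; the function name, the list-of-tuples pixel representation and the clipping to 255 are assumptions made for illustration.

```python
def grey_world(pixels):
    """Scale each RGB channel so that its mean matches the overall grey mean.

    `pixels` is a list of (r, g, b) tuples with components in [0, 255].
    """
    n = len(pixels)
    if n == 0:
        return []
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    grey = sum(means) / 3.0
    # A channel whose mean is zero carries no information; leave it unscaled.
    gains = [grey / m if m > 0 else 1.0 for m in means]
    # Clipping (an assumption here) keeps values in range but can shift the means slightly.
    return [
        tuple(min(255.0, p[c] * gains[c]) for c in range(3))
        for p in pixels
    ]


# Hypothetical two-pixel frame with a reddish cast.
frame = [(200, 100, 100), (100, 50, 50)]
balanced = grey_world(frame)
```

The whole correction needs one pass to compute three means and one pass to apply three gains, which is consistent with the low processing cost noted above.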
Post CC Enhancements: The output of the colour
constancy system is processed in HSV space to
boost contrast and brightness. CCTV videos are
generally so poor in quality that even after the colour
constancy procedure the actual colours of objects
appear dull and indistinguishable. The proposed
method modifies the ‘Saturation’ and
‘Value’ components of HSV pixels as shown in
equations 1 and 2.
The original Value $V_o$ of all pixels
is scaled in a way that the lower $V_o$ values have
a higher scaling factor while the scaling factor for
higher values decreases gradually. Modified pixel
values have been represented using a Quadratic
Bezier Curve as shown in figure 1a. The saturation
‘S’ of all those pixels that have values less than a
threshold $t_s$ is scaled up by a factor $f_s$.

$$V_n = \alpha^2 P_0 + 2\alpha(1-\alpha)P_1 + (1-\alpha)^2 P_2, \qquad 0 < (1-\alpha) < 1 \tag{1}$$

$$S_n = S_o \cdot f_s \quad \text{if } S_o < t_s \tag{2}$$

where $\alpha = (1 - V_o)$, $S_o$ is the Original Saturation ‘S’
of pixels, $V_n$ is the newly calculated Value of pixels,
$S_n$ is the modified Saturation of pixels and $t_s$ is the
Upper threshold for saturation ‘S’.
$P_0$, $P_1$, $P_2$ are the constants in the Quadratic Bezier
Curve. The values $P_0 = 0$, $P_1 = 1$ and $P_2 = 1$ worked
best for videos present in our dataset. The quality of
video gets remarkably enhanced by this method.
Figure 2 shows the effect of these enhancements on
a video frame.
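As a minimal sketch of equations 1 and 2, assume that equation 1 is the standard quadratic Bezier form V_n = α²·P_0 + 2α(1−α)·P_1 + (1−α)²·P_2 with α = 1 − V_o. With P_0 = 0, P_1 = 1 and P_2 = 1 this reduces to V_n = 1 − (1 − V_o)², so a dark pixel with V_o = 0.1 is scaled by 1.9 while a bright pixel with V_o = 0.9 is scaled by only about 1.1. The function names, the values t_s = 0.5 and f_s = 1.5, the clamp of saturation to 1.0 and the choice to leave saturation unchanged at or above t_s are illustrative assumptions that the text does not specify.

```python
import colorsys


def bezier_value(v_o, p0=0.0, p1=1.0, p2=1.0):
    """Equation 1 (assumed standard quadratic Bezier form), with alpha = 1 - V_o."""
    alpha = 1.0 - v_o
    return alpha ** 2 * p0 + 2 * alpha * (1 - alpha) * p1 + (1 - alpha) ** 2 * p2


def boost_saturation(s_o, t_s, f_s):
    """Equation 2: scale saturation by f_s only when it is below the threshold t_s."""
    s_n = s_o * f_s if s_o < t_s else s_o
    return min(1.0, s_n)  # clamp is an assumption, to keep S in [0, 1]


def enhance_pixel(r, g, b, t_s=0.5, f_s=1.5):
    """Apply both enhancements to one RGB pixel with components in [0, 1].

    t_s and f_s are hypothetical values chosen only for this example.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(h, boost_saturation(s, t_s, f_s), bezier_value(v))


dark = bezier_value(0.1)    # low Value: large relative boost
bright = bezier_value(0.9)  # high Value: small relative boost
```

Because the curve passes through 0 and 1 at the endpoints, black and white pixels are left unchanged while mid and low Values are lifted.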
3.2 HSV Quantization
The next step after colour space conversion is
quantization. The HSV space is quantized into 450
VISAPP 2013 - International Conference on Computer Vision Theory and Applications