Finally, we summarize our experiments in section 6,
followed by the conclusions in section 7.
2 MOTION INFORMATION IN MPEG CODED DATA
This section begins with a description of the types of frames stored in an MPEG stream and of how motion information is represented through motion vectors. We continue with a review of different approaches to the extraction and interpretation of this motion information. Since this review revealed no numerical comparison between extraction methods, we close the section with the results of statistical tests that measure the amount and quality of the data obtained.
2.1 Structure of the MPEG Stream
An MPEG stream is composed of three types of frames: I, P and B. Intracoded (I) frames encode the whole image; P and B frames, as shown in figure 1, are coded using motion-compensated prediction, taking a previous P or I frame as reference in the case of P frames, and both a previous and a future frame in the case of B frames.
Figure 1: Reference frames in an MPEG stream.
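As an illustration of this reference structure, the following Python sketch (a hypothetical helper, not part of the MPEG specification) lists, for a display-order pattern of frame types, the anchor frames each P and B frame would predict from.

# Illustrative sketch: given a display-order pattern of frame types,
# list the reference frame(s) each P and B frame depends on.
# P frames predict from the previous I/P frame; B frames use the
# previous and the next I/P frame as references.
def reference_frames(frame_types):
    """frame_types: string such as 'IBBPBBP', one character per frame."""
    anchors = [i for i, t in enumerate(frame_types) if t in 'IP']
    refs = {}
    for i, t in enumerate(frame_types):
        if t == 'I':
            refs[i] = []                                  # intracoded, no reference
        elif t == 'P':
            refs[i] = [max(a for a in anchors if a < i)]  # forward prediction only
        else:                                             # 'B': bidirectional
            refs[i] = [max(a for a in anchors if a < i),
                       min(a for a in anchors if a > i)]
    return refs

print(reference_frames('IBBPBBP'))
# {0: [], 1: [0, 3], 2: [0, 3], 3: [0], 4: [3, 6], 5: [3, 6], 6: [3]}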
The motion information of an MPEG video stream is stored in the Motion Vectors (MVs). The macroblock, an area of 16 by 16 pixels, is the basic unit of the MPEG stream, and the motion vectors are stored at this level. In a video sequence there are usually only small movements from frame to frame, so macroblocks can be matched between frames and, as shown in figure 2, only the difference between the two matched macroblocks is encoded instead of the whole macroblock. The displacement between the two macroblocks in the different frames gives the motion vector associated with the macroblock. Such a vector defines a distance and a direction and has two components: a horizontal (right) one and a vertical (down) one.
Figure 2: Motion vectors associated with one macroblock.
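To make this notation concrete, the following sketch (with illustrative names and a simplified sign convention, not taken from the standard) stores the two components of a macroblock motion vector and uses them to locate the predicting 16 by 16 area in the reference frame.

# Illustrative sketch: a macroblock covers a 16x16 pixel area; its motion
# vector stores a displacement (right, down) towards the matching area in
# the reference frame. Half-pel precision and the sign conventions of the
# real bitstream are ignored here.
from dataclasses import dataclass

MB_SIZE = 16

@dataclass
class MotionVector:
    right: int  # horizontal component, in pixels
    down: int   # vertical component, in pixels

def reference_block_position(mb_col, mb_row, mv):
    """Top-left pixel of the predicting block in the reference frame for
    the macroblock at grid position (mb_col, mb_row)."""
    return (mb_col * MB_SIZE + mv.right,
            mb_row * MB_SIZE + mv.down)

# A macroblock at grid position (5, 3) whose content moved 4 pixels right
# and 2 pixels up with respect to the reference frame:
print(reference_block_position(5, 3, MotionVector(right=4, down=-2)))  # (84, 46)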
The displacement encoded in a motion vector is measured with respect to the reference frame. In applications such as (Pilu, 2001), focused on the estimation of camera motion, these magnitudes can be used directly, but in others, such as video segmentation, the reliability of the system improves if the motion from one frame to the next is known. As shown in figure 3, the problem is that, although P and B frames are supposed to carry motion information, not all of their blocks do. In these cases the data cannot be obtained directly and, as we describe in section 2.2, approximate values must be used.
Figure 3: Real motion information.
2.2 MPEG Information Extraction
In this paper we consider two main groups of methods for extracting motion information: those that calculate the real displacement values from one frame to the next, and those that approximate these values, either to supply missing information, to simplify the extraction process, or to remove the noise inherent in motion vectors. In addition, as will be seen later, these methods may use only the information related to a subset of the frames, depending on their type. Among the methods that extract approximate values, (Venkatesh et al., 2001) use a normalization process that multiplies the MVs of P and B frames by 3 or -3, respectively, while (Kim et al., 2002) divide the magnitude of the motion vector by k, with k the number of frames between the current frame and its reference.
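A minimal sketch of this kind of frame-distance normalization (illustrative only; the factor k is the temporal distance to the reference frame, as above) could be:

# Illustrative sketch of frame-distance normalization, in the spirit of
# (Kim et al., 2002): divide each component of the MV by k, the number of
# frames between the current frame and its reference, to approximate a
# per-frame displacement. (Venkatesh et al., 2001) instead scale the MVs
# of P and B frames by a fixed factor of 3 or -3.
def normalize_mv(mv, frames_to_reference):
    """mv: (right, down) displacement w.r.t. the reference frame."""
    right, down = mv
    k = frames_to_reference
    return (right / k, down / k)

# A motion vector measured against a reference frame 3 frames away:
print(normalize_mv((9, -6), 3))  # (3.0, -2.0)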
(Ardizzone et al., 1999) do not consider the individual values of each motion vector but instead describe a prototypal motion vector field: the whole image is subdivided into N quadrants and each quadrant is characterized by a parameter representing the average magnitude and direction of all the motion vectors of the macroblocks belonging to it.
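As an illustration, the following sketch (assuming a simple 2 by 2 split and a naive averaging of angles; not the authors' implementation) builds such a prototypal field from the motion vectors of the macroblocks:

# Illustrative sketch of a prototypal motion vector field in the spirit of
# (Ardizzone et al., 1999): subdivide the frame into quadrants and describe
# each one by the average magnitude and direction of the MVs of the
# macroblocks it contains.
import math

def prototypal_field(mvs, frame_w, frame_h, grid=2, mb_size=16):
    """mvs: dict mapping (mb_col, mb_row) -> (right, down) in pixels.
    Returns, per quadrant, (mean magnitude, mean direction in radians)."""
    quadrants = {}
    for (col, row), (right, down) in mvs.items():
        qx = min(col * mb_size * grid // frame_w, grid - 1)
        qy = min(row * mb_size * grid // frame_h, grid - 1)
        quadrants.setdefault((qx, qy), []).append((right, down))
    field = {}
    for q, vectors in quadrants.items():
        magnitudes = [math.hypot(r, d) for r, d in vectors]
        directions = [math.atan2(d, r) for r, d in vectors]  # naive angle average
        field[q] = (sum(magnitudes) / len(magnitudes),
                    sum(directions) / len(directions))
    return field

mvs = {(0, 0): (4, 0), (1, 0): (4, 2), (20, 8): (-3, -3)}
print(prototypal_field(mvs, frame_w=352, frame_h=288))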
In (Venkatesh and Ramakrishnan, 2002) two steps are described to remove the noise of the motion vectors: (i) motion accumulation and (ii) selection of representative motion vectors. Motion accumulation consists of scaling the MVs to make