A VIDEO CLASSIFICATION METHOD
FOR USER-CENTERED STREAMING SERVICES
Yuka Kato
Advanced Institute of Industrial Technology
1-10-40 Higashiohi, Shinagawa-ku, Tokyo 1400011, Japan
Katsuya Hakozaki
The University of Electro-Communications
1-5-1 Chofugaoka, Chofu-shi, Tokyo 1828585, Japan
Keywords:
Multimedia communications, user-centered streaming services, user perceptive video quality, video content,
subjective assessment, rate control.
Abstract:
The present paper analyzes the relationship between video content and subjective video quality for user-
centered streaming services. In this analysis, we conduct subjective assessments using various types of video
programs, and propose a method of classifying video programs into a number of groups that are thought by a
large majority of users to have the same video quality. Rate control that yields a high level of user satisfaction can be
performed by applying a different control method to each group obtained using the proposed method. In addition,
we demonstrate the necessity of rate control according to video content by comparing a classification result
based on vision parameters with the classification result based on the assessment result.
1 INTRODUCTION
With the recent increased availability of broadband
Internet access, such as xDSL and FTTH services,
streaming video services on the Internet have be-
come more widely used. In order to provide such
services with high-quality streaming videos, rate control
that adapts to the network environment is needed because
system users access streaming video content
from various system environments and because net-
work resources on which video data are transmitted
are shared among numerous users. Such control
should also consider video content and user preference for
video quality, because user perceptive video quality
differs greatly with these factors. Recently, several
studies have examined rate control methods with ref-
erence to video content. These studies can be classi-
fied into the following two types:
(1) Studies using the meta-data of streaming videos
In these studies, content creators, producers or dis-
tributors give some descriptions of scene importance
as meta-data beforehand, and rate control is based
on these meta-data. For example, a rate control
method using semantic scene importance, which a
content distributor determines according to his/her
subjective judgment, has been proposed (Ohta et al.,
1997). MPEG-7 (ISO/IEC, 2002) can be used to describe
meta-data for these methods.
(2) Studies using the results of feature extraction
In these studies, streaming systems automatically ex-
tract video features, such as scene changes and the
degree of motion complication, and perform suitable
rate control for the features. For example, rate adaptation
transcoding for H.263+ videos has been proposed
(Lei and Georganas, 2002).
In type (1) methods, content distributors can spec-
ify the importance of video scenes, and fine-grained
quality control with high accuracy can be achieved.
User subjective video quality is thereby improved.
However, these methods have a problem in that they
are difficult to apply to streaming services on the In-
ternet because assigning meta-data to video programs
is a time-consuming task and because it is impos-
sible to assign meta-data to live programs. More-
over, the meta-data are based on subjective judg-
ments, which are not always consistent with univer-
sal user perceptive video quality according to video
content. On the other hand, the type (2) methods
can overcome such difficulties of meta-data descrip-
tion, because streaming systems automatically extract
video features. However, the target features of these
methods do not express the semantic scene impor-
tance, but rather visual features, such as the degree of
motion intensity. Semantic importance for the same
scene taken from a football game may differ between
a live sports program and a sports news program.
Kato Y. and Hakozaki K. (2006).
A VIDEO CLASSIFICATION METHOD FOR USER-CENTERED STREAMING SERVICES.
In Proceedings of the International Conference on Signal Processing and Multimedia Applications, pages 200-207
DOI: 10.5220/0001569402000207
Copyright © SciTePress
Therefore, in the present paper, we analyze the rela-
tionship between video content and subjective video
quality for user-centered streaming services. In this
analysis, we conduct subjective assessments by us-
ing various types of video programs and propose a
method of classifying video programs into a number
of groups judged by a large majority of users to have
the same video quality. Control with a high level of
user satisfaction can be achieved by applying different
control methods to each group obtained by using the
proposed method. We also demonstrate the necessity
of rate control according to video content by compar-
ing a classification result based on vision parameters
with that based on the assessment result.
2 PREPARATION
2.1 Feature Description Parameters
In the present paper, we classify video programs that
have various features according to the user’s percep-
tion of video quality. Here, we describe the video fea-
tures as parameters and classify a number of video
programs based on these parameters. We therefore
determine the feature description parameters in this
section. Since our goal is rate control with reference to
video content, we use semantic parameters, which describe
video content, as well as the vision parameters, which
generally describe video features. We use the following
parameters in the present paper:
Vision parameters: average bit rate, motion inten-
sity and number of objects.
Semantic parameters: genre, importance of video,
importance of audio, importance of information
and entertainment factors.
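As a minimal sketch, the two parameter families above can be held in one record per program. The class and field names are illustrative assumptions, not part of the proposed method. The example values are those listed for News (accident) in Table 2, and the scale ranges in the comments are those stated in Section 3.1.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgramFeatures:
    """Feature description parameters of one program (hypothetical container)."""
    genre: str
    # Vision parameters
    average_bit_rate: int       # "Ave. bps" column of Table 2
    motion_intensity: int       # scale from 1 to 5
    number_of_objects: int      # scale from 1 to 3
    # Semantic parameters, each on a scale from 1 to 3
    importance_of_video: int
    importance_of_audio: int
    importance_of_information: int
    entertainment_factor: int


# Row 1 of Table 2.
news_accident = ProgramFeatures("News (accident)", 985, 1, 1, 1, 2, 3, 1)
```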
2.2 Quality Parameters
Next, we define the quality parameters, which are
control targets in the present paper because the same
control method is applied to each group. In gen-
eral, rate control of streaming services is performed
by changing the spatial information content (bit rate
per frame), by changing the time information content
(frame rate), or both. We therefore define three types
of quality parameter: image quality, frame size (spa-
tial information content), and frame rate (time infor-
mation content). These are defined as user perceptive
qualities that do not depend on the streaming codec
type. There is a trade-off between spatial content and
time content when we reduce the amount of information, so a
suitable balance between the two reductions is required for
maximizing the user subjective video quality.
Considering actual coding (or transcoding), more
detailed allocation is needed. Several studies have
Table 1: Relationship between sensation and emotion.
             Burn   Smell   Auditory   Vision
Affection     10      8        4          2
              ..     ..       ..         ..
Cognition      2      4        8         10
examined such allocation methods. For example,
a scene adaptive bit-rate control method in MPEG
video coding (Lee et al., 1997) (reduction of spatial
information content), a method of dynamically dropping
frames according to the system environment for
MPEG-1 streaming systems (Cha et al., 2003) (reduction
of time information content), and a bitrate control
method based on macro-block quantization parameter
assignment in MPEG videos (Keesman et al., 1995)
(reduction of spatial information content) have been
proposed. For the actual coding process, we need to
map the quality parameters to encoder settings. In the
present paper, however, we do not consider the mapping
method because mapping depends on the coder implementation.
2.3 User Preferences Regarding
Video Quality
Next, we define quantitative user preferences for
video quality. All senses, including vision (sight), are
generally accompanied by some emotion (e.g., comfort or
discomfort, like or dislike). For example, the relationship
between sensation and emotion has been proposed as
shown in Table 1 (Pieron, 1950). This table indicates
that higher sensations, such as vision, have a more
intellectual aspect and a higher degree of objectivity,
but also a certain degree of subjectivity (emotional as-
pects). Thus, for vision, a high level of user satisfac-
tion can be obtained by targeting common preferences
among many users while at the same time considering
individual preferences.
From this viewpoint, we define user preference
of video quality using the three quality parameters.
These parameters are combined to define a user preference
vector $p(u_i, v_i) = (w_q, w_s, w_f)$, where $w_q$, $w_s$ and
$w_f$ are the importance of image quality, the importance
of frame size and the importance of frame rate,
respectively, $u_i$ is a user ID, and $v_i$ is a content ID.
Each element of the vector is the weight of the corresponding
quality parameter, i.e., $w_q + w_s + w_f = 1$.
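The constraint that the three weights sum to one can be illustrated with a short Python sketch. The function name and the idea of starting from raw, unnormalized importance scores are assumptions made for illustration only; the paper defines just the normalized vector.

```python
def preference_vector(image_quality, frame_size, frame_rate):
    """Return (w_q, w_s, w_f), scaled so that the weights sum to 1.

    The inputs are hypothetical raw importance scores (non-negative numbers).
    """
    total = image_quality + frame_size + frame_rate
    if total <= 0:
        raise ValueError("at least one importance score must be positive")
    return (image_quality / total, frame_size / total, frame_rate / total)


# A user who values image quality twice as much as each other parameter.
w_q, w_s, w_f = preference_vector(2, 1, 1)
```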
3 RELATIONSHIP BETWEEN
VIDEO CONTENT AND
SUBJECTIVE VIDEO QUALITY
In this section, we use the quality parameters deter-
mined in the previous section to perform subjective
assessment by paired comparison (Guilford, 1954),
which is a psychometric method. Based on the results,
we then conduct a cluster analysis of sample videos
and classify the videos into a number of groups. In the
present paper, our goal is to clarify the effects of the
parameters on the subjective video quality for each
of the respective groups. We therefore analyze these
main factors. Here, we define the subjective video
quality as the level of user satisfaction with respect to
video quality.
3.1 Subjective Assessment
3.1.1 Experimental Methods
In this section, we investigate the degree of the ef-
fects of the quality parameters on subjective video
quality. Several evaluation methods of user satisfaction
have been proposed, including a subjective quality
estimation method using neural networks (Lin and
Mersereau, 1999). However, the evaluation accuracy
of audio and video quality is low because determina-
tion of the absolute degree of audio or video quality
is difficult. In order to obtain reliable results for such
items, a method of paired comparisons can be used.
Using this method, test subjects can easily evaluate
video quality. Therefore, the present paper applies the
method for subjective assessment.
Methods of paired comparisons can be classified
into two types. The first type, which includes
Bradley’s method and Thurstone’s method, expresses
the comparison results by ranking. The second type is
Scheffe’s method, in which the results are expressed
as scores. In the present paper, we express the comparison
results as scores and analyze the results by
assuming various structures of these scores, i.e., we
adopt Scheffe’s method. In addition, we use Nakaya’s
modified method (JUSE, 1973), in which each test
subject is assigned all combinations of trials.
The experimental procedure is described in detail
below. We provided 30 sample videos having various
program features, e.g., various themes, styles and
content. A list of these sample videos
is shown in Table 2. We selected the sample videos
to reflect various viewing purposes in order to ana-
lyze the relationship between the video content and
subjective video quality. The video features of these
sample videos are described by the feature description
parameters defined in Section 2. These parameter
values are also listed in Table 2. Motion intensity is
represented on a scale from 1 to 5 (in which larger
numbers indicate more intensive motion). The num-
ber of objects is represented on a scale from 1 to 3
(in which larger numbers indicate a greater number
of objects). Finally, the importance of semantic para-
meters is represented on a scale from 1 to 3 (in which
larger numbers indicate higher importance).
Table 3: Quality parameter values for the experiment.
                      Image quality   Frame rate   Frame size
Original              100             30 fps       640×480
Slight degradation    27              10 fps       384×256
Medium degradation    15              3 fps        192×128
Severe degradation    10              1 fps        96×94
Next, we prepared 10 test videos for the exper-
iment by changing the three quality parameters in
three grades. In general, degradation appearances of
digital video quality by the same factor (e.g. packet
loss) differ among different codec types and it is nec-
essary to consider these effects on subjective evalua-
tion values. In the present paper, in order to eliminate
these effects, we prepared the video under the con-
dition that sufficient system resources exist. Quality
parameter values for the experiment are listed in Ta-
ble 3. In the table, video quality is denoted as image
degradation ratios.
The number of test videos is 10 for each sample
program. In this experiment, we made pairs of test
videos for each program and had test subjects evaluate
which video was preferable and the degree of preferability
with respect to subjective video quality. The
subjects rated 45 pairs, such as a video with slight
degradation of frame size and a video with medium
degradation of image quality, on a scale ranging from
-3 to 3 (i.e., seven grades) for each program (see Table 4).
Lower grades denote lower subjective evaluation
values. Each test video was made in advance and
was watched by the subjects for approximately twenty
seconds on a 19-inch LCD monitor connected to a PC.
We determined the experimental conditions based on
ITU-R Recommendation BT.500 (ITU-R, 2000). The
distance between the test subject and the monitor was
60 cm, and the number of light sources in the experi-
mental room was one. The measured intensity of illu-
mination was 20 lux at the viewing point. The experi-
mental procedure for comparing video streams, which
consisted of watching and assessing streams, is shown
in Fig. 1. In this figure, A and B denote the compared video
streams, and gray images are displayed for one second
between A and B and for three seconds at the end
of the sequence.
In this case, we did not consider the comparison order
and asked all subjects to compare each pair only
once, because the comparisons in this experiment were
spatial (for example, of colors and shapes), and afterimage
effects were assumed to be negligible. The number of
subjects was six (five male and one female), and 9,450
sets of data were obtained.
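The figure of 45 pairs per program follows from choosing 2 of the 10 test videos without regard to order. A short Python sketch enumerates them; the condition labels follow Table 5 (O, Q1 to Q3, S1 to S3, F1 to F3), while the variable names are illustrative assumptions.

```python
from itertools import combinations

# The original video plus three degradation grades for each of the three
# quality parameters: image quality (Q), frame size (S), frame rate (F).
CONDITIONS = ["O"] + [f"{kind}{grade}" for kind in "QSF" for grade in (1, 2, 3)]

# Unordered pairs, since the comparison order is not considered.
PAIRS = list(combinations(CONDITIONS, 2))

# Ratings one subject gives over all 30 sample programs.
RATINGS_PER_SUBJECT = len(PAIRS) * 30
```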
3.1.2 Experimental Results
These evaluation results can be summarized in a table
having the 10 degradation parameters as its rows and
columns, and the grades as its elements. An
SIGMAP 2006 - INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA
APPLICATIONS
Table 2: Sample programs for the experiment and feature description parameters.
No.  Genre (content)          Ave.   Motion     Number      Importance  Importance  Importance  Entertainment
                              bps    intensity  of objects  of video    of audio    of info.    factors
 1   News (accident)           985   1          1           1           2           3           1
 2   News (sports)            1245   4          3           2           2           3           1
 3   News (weather)           1014   1          2           1           2           3           1
 4   News (general)           1110   1          2           2           2           3           1
 5   Documentary (frescos)    1420   3          3           2           2           3           2
 6   Vaudeville (vaudeville)  1309   3          2           1           2           1           3
 7   Vaudeville (comedy)      1397   4          2           2           2           1           3
 8   Vaudeville (quiz)        1283   2          1           1           2           1           3
 9   Vaudeville (talk)         975   3          3           2           3           1           3
10   Drama (comedy)           1087   3          2           2           2           1           3
11   Drama (medical)           942   2          2           2           2           1           3
12   Drama (foreign)          1020   3          2           2           2           2           3
13   Drama (period drama)     1373   5          1           2           2           1           3
14   Animation (action)       1063   4          1           3           1           1           3
15   Animation (general)      1071   2          1           2           2           1           3
16   Movie (action)            987   4          1           3           2           1           3
17   Movie (war)              1181   5          1           3           2           1           3
18   Movie (SF)                855   1          2           3           2           1           3
19   Movie (musical)           779   3          1           3           3           1           3
20   Sports (baseball)        1253   4          2           3           1           1           3
21   Sports (marathon)        1331   5          1           2           2           1           3
22   Sports (boxing)          1320   4          3           3           1           1           3
23   Sports (rally)           1300   5          1           3           1           1           3
24   Music (pop music)        1147   2          3           2           3           1           3
25   Music (classic)           962   3          2           1           3           1           2
26   Music (traditional)      1124   2          2           1           3           1           3
27   Music (video clip)       1399   5          3           3           3           1           3
28   Hobby (cooking)          1179   3          2           2           2           3           2
29   Information (health)     1179   2          2           2           2           3           2
30   Education (English)       949   2          2           2           3           3           1
Table 4: Quality rating for the experiment.
Grade   Quality
  3     Excellent
  2     Very good
  1     Good
  0     Fair
 -1     Poor
 -2     Bad
 -3     Extremely bad
example of the evaluation values for Program 1 for
one test subject is shown in Table 5.
In this case, for example, the grade of (S2, Q1) is 3.
This means that the video of Q1 is three grades better
than that of S2. Thus, by using the total of the
Q1 column, we can numerically express how much
better, with respect to total grade, the video of Q1 is
compared to the other videos. In this example, the total
grade of the video of Q1 is 9. In the same way, we
calculated the grade of each parameter for each program.
In the present paper, we define a subjective
quality vector for each program as the average of the
grades of all test subjects. This vector is represented
by the quality parameters defined in Section 2.
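The totals can be reproduced with a few lines of Python. The matrix below is transcribed from Table 5, with `None` marking the self-comparison cells. The function names, and the averaging helper for several subjects, are illustrative assumptions about how the subjective quality vector could be computed.

```python
LABELS = ["O", "Q1", "Q2", "Q3", "S1", "S2", "S3", "F1", "F2", "F3"]

# Rows and columns follow LABELS; entry [r][c] is the grade of (row, column).
TABLE5 = [
    [None, 0, 0, -1, -2, -3, -3, 0, 0, 0],
    [0, None, 0, -1, -2, -3, -3, 0, 0, 0],
    [0, 0, None, 0, -2, -3, -3, 0, 0, 0],
    [1, 1, 0, None, -1, -3, -3, 1, 1, 1],
    [2, 2, 2, 1, None, -2, -3, 2, 2, 2],
    [3, 3, 3, 3, 2, None, -1, 3, 3, 3],
    [3, 3, 3, 3, 3, 1, None, 3, 3, 3],
    [0, 0, 0, -1, -2, -3, -3, None, 0, 0],
    [0, 0, 0, -1, -2, -3, -3, 0, None, 0],
    [0, 0, 0, -1, -2, -3, -3, 0, 0, None],
]


def column_totals(table):
    """Sum each column, skipping the empty self-comparison cells."""
    size = len(table)
    return [sum(table[r][c] for r in range(size) if table[r][c] is not None)
            for c in range(size)]


def subjective_quality_vector(tables):
    """Average the column totals over the tables of all test subjects."""
    totals = [column_totals(t) for t in tables]
    return [sum(col) / len(totals) for col in zip(*totals)]


totals = dict(zip(LABELS, column_totals(TABLE5)))
```

Here `totals["Q1"]` corresponds to the total grade of the Q1 video discussed above.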
Figure 1: Experimental procedure.
3.2 Cluster Analysis
In this section, we calculate the Euclidean distances
in ten-dimensional space between the experimentally
obtained data and conduct cluster analysis. Cluster
analysis is a method of collecting similar items from a
sample set according to similarities among the items
and classifying the items into a number of homoge-
neous clusters. These methods are roughly classi-
fied into two types: hierarchical clustering, which ob-
tains a dendrogram as a result, and non-hierarchical
clustering, which divides the sample data into a pre-
determined number of groups. In this analysis, we
adopted hierarchical clustering because the number of
groups was not determined. In addition, we adopted
the Ward method to obtain manageable clusters. Figure 2
shows a dendrogram obtained by cluster analysis.
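For readers unfamiliar with the Ward method, the following pure-Python sketch shows the merging rule: at each step, the two clusters whose union gives the smallest increase in the within-cluster sum of squared Euclidean distances are merged. It is an illustration under stated assumptions, not the analysis tool used in the paper. It stops at a requested number of clusters instead of cutting a dendrogram at a height, and the toy vectors are invented (the paper uses ten-dimensional subjective quality vectors).

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clusters(vectors, n_clusters):
    """Agglomerative clustering with Ward's criterion; returns index lists."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost([vectors[k] for k in clusters[ij[0]]],
                                     [vectors[k] for k in clusters[ij[1]]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Invented two-dimensional toy data with two visually obvious groups.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
groups = ward_clusters(toy, 2)
```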
Table 5: An example of the experimental result (Program 1).
         O   Q1   Q2   Q3   S1   S2   S3   F1   F2   F3
O        .    0    0   -1   -2   -3   -3    0    0    0
Q1       0    .    0   -1   -2   -3   -3    0    0    0
Q2       0    0    .    0   -2   -3   -3    0    0    0
Q3       1    1    0    .   -1   -3   -3    1    1    1
S1       2    2    2    1    .   -2   -3    2    2    2
S2       3    3    3    3    2    .   -1    3    3    3
S3       3    3    3    3    3    1    .    3    3    3
F1       0    0    0   -1   -2   -3   -3    .    0    0
F2       0    0    0   -1   -2   -3   -3    0    .    0
F3       0    0    0   -1   -2   -3   -3    0    0    .
Total    9    9    8    2   -8  -22  -25    9    9    9
(.: no self-comparison)
O: Original
Q1: Slight degradation of image quality, Q2: Medium, Q3: Severe
S1: Slight degradation of frame size, S2: Medium, S3: Severe
F1: Slight degradation of frame rate, F2: Medium, F3: Severe
Figure 2: Dendrogram obtained by cluster analysis.
As the number of clusters increases, a more detailed
analysis can be carried out, but it becomes difficult
to pick out factors representing each cluster because
the number of elements in a cluster is small. In the
present paper, we divided the programs into groups at
the dotted line shown in Fig. 2, and constructed four
groups, Group A through D, because the number of
quality parameters, which are the control targets, is
three and the suitable number of groups is from three
to five. The features of each group are described as
follows:
Group A contains News (accident) and News
(weather). In these programs, the importance of
information is high and the entertainment factor is
small. Thus, the importance of video is not so high.
Group B contains News (general), Documentary
(fresco), Vaudeville (vaudeville), Vaudeville (quiz),
Hobby (cooking), Information (health) and Educa-
tion (English). With the exception of the vaudeville
programs, these programs are watched in order to
gather information, and so the importance of infor-
mation is high for these programs. The quality lev-
els of video and audio must be sufficient so as not
to interfere with information gathering. Vaudeville
programs with little motion and with small num-
bers of objects are contained in the group. This
is because the entertainment factors of these pro-
grams are high, but neither the importance of video
nor the importance of audio is high.
Group C contains News (sports), Vaudeville (com-
edy), Drama (period drama), Movie (war), Movie
(musical), Sports (baseball), Sports (marathon),
Sports (boxing), Sports (rally) and Music (classic).
In these programs, the entertainment factor
and the importance of video are high. Programs
having high motion intensities, including all sports
Figure 3: Classification process.
related programs, are included in this group.
Group D includes Vaudeville (talk), Drama (com-
edy), Drama (medical), Drama (foreign), Anima-
tion (action), Animation (general), Movie (action),
Movie (SF), Music (pop music) and Music (video
clip). The entertainment factors of these programs
are high, but the importance of video is low com-
pared to the programs in Group C. Music programs,
for which the importance of video is relatively high,
are included in this group. Although there are
some exceptions, general dramas and video pro-
grams with low importance of motion are contained
in the group.
Using these results, we designed a method
to classify the video programs into the above four
groups. Four factors, which are the importance of in-
formation, the importance of video, the importance of
audio and motion intensity, were used. The classifica-
tion process is shown in Fig. 3. All programs except
Animation (action), Movie (action), Movie (musical)
and Music (classic) can be appropriately classified by
this rule. In this case, we assumed that video fea-
tures are given to each program as described in Table
2 in advance. However, these features can be given
automatically to a certain extent using program gen-
res. For example, news programs may have high im-
portance of information and low entertainment fac-
tors, and movie programs may have high entertain-
ment factors. Although there are some programs that
need to be considered individually, such as sports
news, feature description according to program gen-
res makes automatic classification possible.
3.3 Relationship between Content
and Subjective Video Quality
In order to apply the results of the previous section
to actual quality control, in this section, we investi-
gate the quantitative relationship between video con-
tent and subjective video quality.
First, we normalized the quality parameters used
in the experiment, as shown in Table 6, for multiple
linear regression analysis. The results of the multiple
linear regression analysis are described below. In
these equations, $y$, $x_1$, $x_2$ and $x_3$ represent the
subjective quality value (which expresses the degree of user
satisfaction with video quality), the normalized value of
image quality, the normalized value of frame rate and
the normalized value of frame size, respectively. The
coefficient of determination ($r^2$) and the F-statistic ($F$)
of each term are also shown.
Group A: $y = -0.02x_1^2 + 2.5x_1 - 0.04x_3^2 + 6.1x_3 - 248.3$
($r^2$: 0.965, $F$: 4.8, 4.6, n/a, n/a, 45.3, 85.2)
Group B: $y = -0.02x_1^2 + 2.4x_1 - 0.03x_2^2 + 4.3x_2 - 0.05x_3^2 + 7.1x_3 - 372.3$
($r^2$: 0.916, $F$: 6.0, 6.3, 40.7, 64.9, 100.6, 76.2)
Group C: $y = -0.02x_1^2 + 2.7x_1 - 0.04x_2^2 + 6.6x_2 - 0.04x_3^2 + 6.1x_3 - 474.2$
($r^2$: 0.928, $F$: 9.4, 10.9, 104.7, 204.6, 94.7, 172.5)
Group D: $y = -0.04x_1^2 + 4.5x_1 - 0.05x_2^2 + 7.5x_2 - 0.04x_3^2 + 6.1x_3 - 468.4$
($r^2$: 0.920, $F$: 28.8, 31.4, 171.3, 269.4, 100.5, 178.7)
Since all of the coefficients of determination are
greater than 0.9, these estimations are highly accu-
rate. These F-statistics indicate the degree to which
the quality parameters contribute to y estimation, and
a large value of F indicates a large contribution (more
precisely, F is the value for the test of significance
in adding the parameter to the explanatory variables).
Under practical streaming environments, there are
various constraints of quality control, such as limi-
tation of available quality parameters. In such cases,
efficient quality control can be achieved by selecting
parameter values that maximize the subjective quality
value y under the constraints.
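As a minimal sketch, the four group equations can be evaluated in Python. The coefficients are those given above, read with negative quadratic terms and a negative constant. This sign reading is an assumption, but it reproduces the 500 kbps y values of Table 7 to within rounding. The dictionary layout and function name are hypothetical.

```python
# Per group: (a1, b1), (a2, b2), (a3, b3), constant, for
# y = a1*x1^2 + b1*x1 + a2*x2^2 + b2*x2 + a3*x3^2 + b3*x3 + constant.
# Group A has no frame-rate (x2) terms.
COEFFS = {
    "A": ((-0.02, 2.5), (0.0, 0.0), (-0.04, 6.1), -248.3),
    "B": ((-0.02, 2.4), (-0.03, 4.3), (-0.05, 7.1), -372.3),
    "C": ((-0.02, 2.7), (-0.04, 6.6), (-0.04, 6.1), -474.2),
    "D": ((-0.04, 4.5), (-0.05, 7.5), (-0.04, 6.1), -468.4),
}


def subjective_quality(group, x1, x2, x3):
    """Estimated subjective quality y for normalized quality parameters.

    x1: image quality, x2: frame rate, x3: frame size.
    """
    (a1, b1), (a2, b2), (a3, b3), constant = COEFFS[group]
    return (a1 * x1 ** 2 + b1 * x1
            + a2 * x2 ** 2 + b2 * x2
            + a3 * x3 ** 2 + b3 * x3
            + constant)


# Sports (boxing), Group C, 500 kbps row of Table 7: (x1, x2, x3) = (48, 79, 70).
y_boxing = subjective_quality("C", 48, 79, 70)
```

Because every quadratic coefficient is negative, each equation is concave in the parameter it contains, so y peaks at an intermediate value of that parameter.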
4 APPLICATION TO RATE
CONTROL METHODS
In this section, we apply the analysis results obtained
in the previous section to an actual encoding system
and demonstrate the implementation process of the
proposed method on practical streaming servers. This
encoding system is TMPGEnc (TMPGEnc), which
is a software encoder. We provided MPEG-1 videos
by using TMPGEnc for subjective assessment. TMP-
GEnc enables us to set various quality parameters in-
cluding image quality, frame rate and frame size to
target videos so that we can obtain encoded videos
with various video qualities. At this time, the rela-
tionship between these quality parameters and the bi-
trate of encoded videos can be roughly expressed by
Table 6: Normalization of the quality parameters.
                      Measured                                  Normalized
                      Image quality  Frame rate  Frame size    Image quality  Frame rate  Frame size
Original              100            30 fps      640×480       100            100         100
Slight degradation    27             10 fps      384×256       27             33          32
Medium degradation    15             3 fps       192×128       15             10          8
Severe degradation    10             1 fps       96×94         10             3           3
Table 7: Numerical examples.
Programs                  kbps   $y$   $x_1$   $x_2$   $x_3$
News (accident)           500    62    58      59      77
Vaudeville (vaudeville)   500    105   54      70      70
Sports (boxing)           500    112   48      79      70
Movie (action)            500    170   51      72      72
News (accident)           300    62    56      29      76
Vaudeville (vaudeville)   300    86    31      64      66
Sports (boxing)           300    82    26      75      68
Movie (action)            300    136   30      67      65
News (accident)           100    47    36      10      70
Vaudeville (vaudeville)   100    32    10      48      59
Sports (boxing)           100    3     10      61      47
Movie (action)            100    43    10      57      50
the following equation:
$$B = 0.0766 \times x_1^{0.732} \times x_2^{0.677} \times x_3^{0.703}$$
where $x_1$, $x_2$ and $x_3$ are normalized quality parameters,
as indicated in Table 6, and $B$ is the bitrate
(kbps). By using this equation and the relationship
between subjective video quality and the quality parameter
values obtained in the previous section, we
can calculate a set of parameter values $x_1$, $x_2$, $x_3$ for
the maximum subjective quality value under the condition
of fixed bitrate $B$.
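The search for such a parameter set can be sketched as follows. The paper does not state which optimization procedure was used, so the exhaustive search over integer parameter values from 1 to 100 is an assumption, as are the function names. The quality function is the Group C equation with negative quadratic terms and constant, a sign reading that matches the y values of Table 7 to within rounding.

```python
SCALE, E1, E2, E3 = 0.0766, 0.732, 0.677, 0.703  # bitrate model constants


def bitrate(x1, x2, x3):
    """Approximate bitrate B (kbps) for normalized quality parameters."""
    return SCALE * x1 ** E1 * x2 ** E2 * x3 ** E3


def quality_c(x1, x2, x3):
    """Group C subjective quality (signs of the terms are an assumption)."""
    return (-0.02 * x1 ** 2 + 2.7 * x1
            - 0.04 * x2 ** 2 + 6.6 * x2
            - 0.04 * x3 ** 2 + 6.1 * x3
            - 474.2)


def best_parameters(quality, budget_kbps):
    """Exhaustively search integer settings with bitrate <= budget_kbps.

    Returns (y, x1, x2, x3) for the best setting found.
    """
    best = None
    for x1 in range(1, 101):
        for x2 in range(1, 101):
            base = SCALE * x1 ** E1 * x2 ** E2
            for x3 in range(1, 101):
                if base * x3 ** E3 > budget_kbps:
                    break  # bitrate only grows with x3
                y = quality(x1, x2, x3)
                if best is None or y > best[0]:
                    best = (y, x1, x2, x3)
    return best


y, x1, x2, x3 = best_parameters(quality_c, 500.0)
```

Because the published coefficients are rounded, the setting found this way need not coincide exactly with the corresponding row of Table 7, but it should be at least as good under the same bitrate budget.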
Numerical examples are shown in Table 7. Target
videos are News (accident) in Group A, Vaudeville
(vaudeville) in Group B, Sports (boxing) in Group C
and Movie (action) in Group D. Here, we show sets
of quality parameter values and subjective quality values,
where the bitrates are 500 kbps, 300 kbps and 100
kbps. Here, $x_1$, $x_2$ and $x_3$ are also normalized values.
These results indicate that optimal parameter sets are
different in each group, especially in the case of low
bitrates. This influence is large for tasks such as video
transmission and storing videos on disks.
In actuality, there may be few cases in which all
of the quality parameters can be changed without any
restriction, e.g., frame size is fixed on a mobile terminal.
Even in such cases, the same procedure can
be used for obtaining a set of the parameters using
some fixed values. Here, let us consider the case in
which the frame size is fixed as $x_3 = 33$, i.e., the
frame size is 384 × 256. In Group A, $y = -23.0$ at
both $(x_1, x_2, x_3) = (82, 100, 33)$ and $(x_1, x_2, x_3) =
(45, 18, 33)$, but the bitrate is 500 kbps in the former
case and 100 kbps in the latter case. In this case, we
can reduce the bitrate to 20 % of its former value by
changing the set of quality parameter values, without
lowering the subjective quality value. As another example, in
Group B, $y = 0.55$ and $y = 2.46$ at $(x_1, x_2, x_3) =
(88, 93, 33)$ and $(x_1, x_2, x_3) = (82, 47, 33)$, respectively,
but the bitrate is 500 kbps in the former case
and 300 kbps in the latter case. In this case, we can
reduce the bitrate to 60 % of its former value.
5 DISCUSSION
In this section, we compare a classification result ob-
tained based on only vision parameters (not according
to video content) and a classification result obtained
based on the assessment result (according to video
content), as mentioned in the previous section. We
conducted cluster analysis for the same 30 programs
used in the previous section by using the three types of
vision parameters (average bps, motion intensity and
number of objects) listed in Table 2. Figure 4 shows a
dendrogram obtained by this analysis. We divided the
programs into groups at the dotted line shown in the
figure and constructed the following four groups:
Group 1: News (sports), Documentary (fresco),
Vaudeville (vaudeville, comedy, quiz), Drama (pe-
riod drama), Sports (baseball, marathon, boxing,
rally) and Music (video clip)
Group 2: News (accident, weather), Vaudeville
(talk), Drama (medical, foreign), Movie (action),
Music (classic) and Education (English)
Group 3: Movie (SF, musical)
Group 4: News (general), Drama (comedy), Ani-
mation (action, general), Movie (war), Music (pop
music, traditional), Hobby (cooking) and Informa-
tion (health)
There are a number of common features shared between
the groups described in Section 3 and those described
in this section. For example, both the programs in Group
1 and those in Group C appear to have a feature of in-
tensive motion. However, for the most part, the clas-
sification result based on the subjective assessment is
very different from that based on only vision para-
meters. As a result, we found that subjective video
Figure 4: Dendrogram obtained by cluster analysis.
quality is affected by not only vision parameters, but
also semantic parameters.
Moreover, an important point to note is that vision
parameter values change with time, whereas seman-
tic parameter values do not change within a program.
Streaming services are continuous media, and it is
generally difficult to perform rate control following
such vision parameter changes. By using semantic
parameters, which do not change within a program,
for control, subjective video quality may be improved
and more effective control might be achieved.
6 CONCLUSION
In the present paper, we analyzed the relationship be-
tween video content and subjective video quality for
user-centered streaming services. In this analysis, we
conducted subjective assessment using various kinds
of video programs and clarified common perceptive
video quality for several users. Based on the results,
we proposed a method of classifying video programs
into a number of groups judged by a large majority of
users to have the same video quality. We also showed
the necessity for rate control according to video con-
tent by comparing a classification result based on vi-
sion parameters with a classification result based on
the assessment result. In the future, we will conduct a
subjective assessment with a larger number of test subjects
(from 20 to 30) and will determine the values of
feature description parameters as objective data.
ACKNOWLEDGEMENTS
This work was supported in part by the Grant-in-Aid
for Scientific Research (No. 15700034) of the Japan
Society for the Promotion of Science.
REFERENCES
Cha, H., Oh, J., and Ha, R. (2003). Dynamic frame dropping
for bandwidth control in MPEG streaming system.
In Multimedia Tools and Applications 19. Springer
Science.
Guilford, J. (1954). Psychometric methods. McGraw-Hill.
ISO/IEC (2002). MPEG-7: Multimedia Content Description
Interface. ISO/IEC 15938.
ITU-R (2000). Methodology for the Subjective Assessment
of the Quality of Television Pictures. ITU-R BT.500-10.
JUSE (1973). Sensory Evaluation Handbook. JUSE Press.
Keesman, G., Shah, I., and Klein-Gunnewiek, R. (1995).
Bitrate control for MPEG encoders. In Image Communications 6.
Signal Processing.
Lee, M., Kwon, S., and Kim, J. (1997). A scene adaptive
bitrate control method in MPEG streaming system. In
VCIP. SPIE 3024.
Lei, Z. and Georganas, N. (2002). Rate adaptation transcoding
for precoded video streams. In ACM Multimedia’02. ACM.
Lin, F. and Mersereau, R. (1999). Rate-quality tradeoff
MPEG video encoder. In Image Communications 14.
Signal Processing.
Ohta, K., Watanabe, T., and Mizuno, T. (1997). Selective
multimedia access protocol for wireless multimedia
communication. In IEEE PACRIM’97. IEEE.
Pieron, H. (1950). Feeling and emotion. McGraw-Hill.
TMPGEnc. http://www.tmpgenc.net/.