Figure 3: Scheme of data hiding in digital audio.
sented in Fig. 4. As can be seen, the whole extraction algorithm is similar to the hiding idea described above. The following subsections give the details of the algorithm.
Figure 4: Scheme of data extraction from the stegoaudio.
4.1 Adaptive Quantization Step Selection
The quantization step plays an important role in the transparency and robustness of the audio steganography algorithm. The larger the quantization step, the more robust the hidden information is against signal attacks, but also the more perceptible it becomes. Conversely, a smaller quantization step degrades robustness. If the same quantization step is adopted for the whole host audio, it will likely cause one or both of these problems in some parts of the signal. Therefore the quantization step should be selected according to the local audio correlation, human auditory masking, and the expected signal attacks.
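The transparency/robustness trade-off can be illustrated with a minimal quantization-index-modulation (QIM) sketch; this is our own illustration, not the paper's exact embedding scheme. One bit is hidden by snapping a value to an even or odd multiple of the step T: a larger T survives more distortion but shifts the signal further from its original value.

```python
def qim_embed(value, bit, T):
    """Quantize `value` to the nearest multiple of T whose index parity encodes `bit`."""
    q = round(value / T)
    if q % 2 != bit:  # move to the neighbouring quantization cell of correct parity
        q += 1 if value / T >= q else -1
    return q * T

def qim_extract(value, T):
    """Recover the hidden bit from the parity of the nearest quantization index."""
    return round(value / T) % 2

# A larger T introduces more distortion but survives more noise:
v, noise = 0.437, 0.012
for T in (0.01, 0.1):
    marked = qim_embed(v, 1, T)
    print(T, abs(marked - v), qim_extract(marked + noise, T))
```

With T = 0.01 the embedding distortion is small, but noise of 0.012 already flips the recovered bit; with T = 0.1 the bit survives the same noise at the cost of a ten-times-larger shift.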
Digital audio always exhibits various auditory features. It is important to choose those which represent different parameters of the digital audio and allow an appropriate embedding threshold to be selected. A single bit is hidden in a short audio block, so the audio features should be computed over the samples of the given block, or over its neighborhood if the block is very short; too short a block might distort the computed feature values. In addition, the values should be independent of the length of the audio block within a given range. To meet the above requirements, based on reviews and experience reported in the literature (Peeters, 2004; E. Schubert, 2004; S. H. Srinivasan, 2003; Wang Xiang-yang, 2004; Changsheng Xu, 2002), we have selected audio features such as fundamental frequency, short-time mean energy, harmonic concentration, spectral centroid, harmonic energy distribution, maximum energy, and zero-crossing rate.
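Three of the listed features can be sketched for a single block as follows; the function names and the naive O(n²) DFT are our own illustration, not the paper's implementation.

```python
import math

def short_time_mean_energy(block):
    """Mean of squared samples over the block."""
    return sum(s * s for s in block) / len(block)

def zero_crossing_rate(block):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(block) - 1)

def spectral_centroid(block, sample_rate):
    """Magnitude-weighted mean frequency over the non-negative DFT bins."""
    n = len(block)
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * t / n) for t, s in enumerate(block))
        im = -sum(s * math.sin(2 * math.pi * k * t / n) for t, s in enumerate(block))
        mags.append(math.hypot(re, im))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

# 437.5 Hz sine at 8 kHz: exactly 14 cycles per 256-sample block (an exact DFT bin),
# so the centroid sits at 437.5 Hz and the mean energy is 0.5.
sr = 8000
block = [math.sin(2 * math.pi * 437.5 * t / sr) for t in range(256)]
```

Such per-block values, being normalised by block length, stay comparable across blocks of moderately different sizes, as the text requires.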
Values of the features for a single digital audio block form a feature vector $X_i = \{x_{i1}, x_{i2}, \ldots, x_{i9}\}$, which determines a point in a 9-dimensional space. Such a vector is computed for each audio block from all training audio records, and all the vectors together construct a data set X. Similarity between vectors gives a way to find a common quantization step for closely located points. To group them we have used the fuzzy C-means clustering algorithm (J. C. Bezdek, 1987). As a result we obtain the centers of the groups, and for each center we adopt a separate quantization step $T_k$.
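At embedding time, a block's feature vector can then be matched to its nearest cluster center to look up the step $T_k$. A hypothetical sketch (the helper name and toy values are ours; the paper's vectors are 9-dimensional):

```python
def select_step(feature_vec, centers, steps):
    """Return the quantization step of the nearest cluster center (squared Euclidean)."""
    d2 = [sum((f - c) ** 2 for f, c in zip(feature_vec, ctr)) for ctr in centers]
    return steps[d2.index(min(d2))]

# Toy 2-D centers and their trained per-cluster steps (illustrative values only).
centers = [[0.2, 0.5], [0.8, 0.1]]
steps = [0.01, 0.05]
```

Because both hider and extractor derive the centers from the same training procedure, the same step is recovered on both sides without transmitting it.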
4.2 Fuzzy Clustering Algorithm
Let X be a data set, and let $x_i$ denote one sample ($i = 1, 2, \ldots, N$). The goal of clustering is to partition X into $K$ ($2 \le K \le N$) subsets, or representative clusters, so that the most similar samples fall into the same cluster whenever possible. A typical clustering algorithm aims to assign each sample strictly to one cluster. In fact, however, there is often no explicit characteristic by which a sample can be grouped, and many samples belong to several clusters. In that situation a fuzzy clustering algorithm can perform better. The fuzzy clustering algorithm divides the data set X into K clusters according to fuzzy memberships $u_{ik}$ ($0 \le u_{ik} \le 1$) that represent the degree to which the sample $x_i$ belongs to the $k$-th ($k = 1, 2, \ldots, K$) cluster. The result of the clustering can therefore be described as an $N \times K$ matrix composed of the values $u_{ik}$, where

$$\sum_{k=1}^{K} u_{ik} = 1; \qquad 0 < \sum_{i=1}^{N} u_{ik} < N. \quad (1)$$
Proposed by J. C. Bezdek (J. C. Bezdek, 1987), the fuzzy C-means clustering algorithm seeks a fuzzy division by minimizing the following objective function:

$$J_q(U, V) = \sum_{k=1}^{K} \sum_{i=1}^{N} (u_{ik})^q \, d^2(X_i - V_k), \quad (2)$$

where:
N - the number of feature vectors,
K - the number of clusters (partitions),
q - the weighting exponent (fuzzifier; q > 1),
$u_{ik}$ - the membership of the i-th feature vector in the k-th cluster,
$V_k$ - the center of the k-th cluster,
$X_i$ - the i-th feature vector,
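A compact sketch of the standard fuzzy C-means iteration, alternating between the center and membership updates that minimize Eq. (2) under the constraint of Eq. (1). It is written in plain Python with squared Euclidean distance as $d^2$; the paper's 9-dimensional feature vectors would be passed as the rows of `data`.

```python
import random

def fuzzy_c_means(data, K, q=2.0, iters=50, seed=0):
    """Return (U, V): N x K membership matrix and K cluster centers."""
    rng = random.Random(seed)
    N, dim = len(data), len(data[0])
    # Random initial memberships, normalised so each row sums to 1 (Eq. 1).
    U = [[rng.random() for _ in range(K)] for _ in range(N)]
    U = [[u / sum(row) for u in row] for row in U]
    for _ in range(iters):
        # Center update: weighted means with weights u_ik^q.
        V = []
        for k in range(K):
            w = [U[i][k] ** q for i in range(N)]
            tot = sum(w)
            V.append([sum(w[i] * data[i][d] for i in range(N)) / tot
                      for d in range(dim)])
        # Membership update minimising Eq. (2) for fixed centers.
        for i in range(N):
            d2 = [sum((data[i][d] - V[k][d]) ** 2 for d in range(dim))
                  for k in range(K)]
            if any(x == 0 for x in d2):  # sample coincides with a center
                U[i] = [1.0 if x == 0 else 0.0 for x in d2]
                continue
            for k in range(K):
                U[i][k] = 1.0 / sum((d2[k] / d2[j]) ** (1.0 / (q - 1))
                                    for j in range(K))
    return U, V
```

On well-separated data the memberships quickly become nearly crisp, while samples between centers keep intermediate $u_{ik}$ values, which is exactly the behaviour the text argues for.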
SIGMAP 2007 - International Conference on Signal Processing and Multimedia Applications