ing sounds. In particular, formant-like spectral peaks have been used to classify OSA patients and simple snorers (Emoto T., 2010), (Ng A.K., 2008).
These conventional studies rely on linear analysis methods such as FFT and LPC, yet it is natural to suppose that snoring arises from nonlinear dynamics. Beck et al. (Beck R., 1995) argued that complex-waveform snores result from the oscillation of oropharyngeal soft tissues with collision of the airway wall. Moreover, the waveforms are found to change gradually or suddenly over time. Such nonlinear and nonstationary dynamics are generally found in snoring sounds, but these properties have not yet been analyzed in detail.
On the other hand, HHT has also been applied to airway pressure signals related to OSA (Salisbury J. I., 2007), (Caseiro P., 2010). In these studies, the histogram of HHT spectra in a specific frequency range is calculated over 300 seconds and used to discriminate OSA from non-OSA persons. These methods are valuable, but they differ from our approach in two respects: 1. They do not address nonstationary properties, because computing the histogram of HHT spectra discards the time structure. One of our hypotheses is that the time structure also carries useful information about OSA, which has not been verified in conventional studies. 2. The data analyzed in these papers are airway pressure signals obtained from nasal breath (Salisbury J. I., 2007) and oronasal breath (Caseiro P., 2010), whereas this paper focuses on the nasal snoring sound.
3 METHOD
3.1 Subjects and Instrument
A portable linear PCM (pulse code modulation) sound recorder, the Olympus LS-10, is used to record snoring sounds. The sampling frequency and quantization bit depth are set to 44.1 kHz and 16 bits, respectively. The snoring sound analyzed in this paper (shown in Figure 1) is recorded from a healthy male subject.
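As a rough illustration of these recording settings, the following sketch reads 16-bit linear PCM samples and scales them to the range [-1, 1). It is only a sketch under stated assumptions: the recording is assumed to be available as a mono WAV file, the helper name `read_pcm16_mono` is hypothetical, and the 100 Hz tone written to memory is a synthetic stand-in for an actual snoring recording.

```python
import array
import io
import math
import sys
import wave

FS = 44_100        # sampling frequency in Hz, as stated in the text
SAMPLE_WIDTH = 2   # bytes per sample, i.e. 16-bit linear PCM

def read_pcm16_mono(source):
    """Return (sampling rate, samples scaled to [-1, 1)) from a mono 16-bit WAV."""
    with wave.open(source, "rb") as wf:
        if wf.getsampwidth() != SAMPLE_WIDTH or wf.getnchannels() != 1:
            raise ValueError("expected mono 16-bit PCM")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, [s / 32768.0 for s in samples]

# Synthetic stand-in for a recording (assumption): 0.1 s of a 100 Hz tone.
n_samples = int(0.1 * FS)
tone = array.array(
    "h",
    (int(0.5 * 32767 * math.sin(2 * math.pi * 100.0 * n / FS))
     for n in range(n_samples)),
)
if sys.byteorder == "big":
    tone.byteswap()

buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(SAMPLE_WIDTH)
    wf.setframerate(FS)
    wf.writeframes(tone.tobytes())
buf.seek(0)

rate, x = read_pcm16_mono(buf)
```

At 44.1 kHz the representable bandwidth extends to 22.05 kHz, and 16-bit quantization gives 65536 amplitude levels.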
The subject is asked to simulate nasal snoring by breathing deeply enough to oscillate the soft palate in his throat. While producing snores, the subject's mouth is completely closed. Such snoring, commonly called simulated snoring, is not always equivalent to snoring generated during sleep, but it has traditionally been adopted in some medical studies.
3.2 Hilbert-Huang Transform (HHT)
The Hilbert-Huang transform (HHT), which consists of an empirical mode decomposition (EMD) followed by Hilbert spectral analysis, was developed by Huang et al. (Huang N.E., 1998). It presents a fundamentally new approach to the analysis of time series data. Its essential feature is the use of an adaptive time-frequency decomposition that does not impose a fixed basis set on the data; therefore, unlike Fourier or wavelet analysis, its application is not limited by the time-frequency uncertainty relation. This makes it a highly efficient tool for investigating transient and nonlinear features.
The Hilbert transform of a function $h(t)$ is defined by

$$v(t) = \frac{1}{\pi} P \int_{-\infty}^{\infty} \frac{h(\tau)}{t - \tau}\, d\tau = h(t) * \frac{1}{\pi t}, \tag{1}$$
where $P$ and $*$ denote the Cauchy principal value of the singular integral and the convolution, respectively. By the theory of the Poisson integral, $F(t) = h(t) + iv(t)$ is the boundary value of a holomorphic function $F(z) = F(t + iv) = a_{HT}(t)e^{i\theta(t)}$ in the upper half-plane, if $h(t) \in L^p$ (the Lebesgue space for $1 < p < \infty$). Then the instantaneous amplitude (IA) $a_{HT}(t)$ and the instantaneous frequency (IF) $f_{HT}(t)$ are, respectively, defined by
$$a_{HT}(t) = \sqrt{h(t)^2 + v(t)^2}, \tag{2}$$

and

$$f_{HT}(t) = \frac{1}{2\pi} \frac{d\theta(t)}{dt}, \quad \text{where } \theta(t) = \tan^{-1}\frac{v(t)}{h(t)}. \tag{3}$$
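Equations (1) to (3) can be sketched numerically as follows. This is an illustrative sketch and not the implementation used in this paper: it assumes NumPy, builds the analytic signal $h(t) + iv(t)$ with the standard FFT construction instead of evaluating the principal-value integral directly, and approximates the derivative in Eq. (3) by a finite difference. The function names, the 1 kHz sampling rate, and the 50 Hz test tone are hypothetical choices made only for the example.

```python
import numpy as np

def analytic_signal(h):
    """Return h + i*v, where v is the discrete Hilbert transform of h (FFT method)."""
    n = len(h)
    spectrum = np.fft.fft(h)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_amplitude_frequency(h, fs):
    """Return (IA, IF in Hz) of a real signal h sampled at fs Hz."""
    z = analytic_signal(h)
    amplitude = np.abs(z)                 # Eq. (2): sqrt(h^2 + v^2)
    # Eq. (3): theta = atan(v / h); np.angle uses the quadrant-aware arctan2,
    # and np.unwrap removes the 2*pi jumps before differentiating.
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)  # finite-difference d(theta)/dt
    return amplitude, frequency

# Hypothetical test tone: 0.8 * cos(2*pi*50*t), one second at 1 kHz.
fs = 1000.0
t = np.arange(1000) / fs
h = 0.8 * np.cos(2.0 * np.pi * 50.0 * t)
ia, inst_f = instantaneous_amplitude_frequency(h, fs)
```

For this zero-mean tone with an integer number of periods in the window, the IA stays at the tone amplitude and the IF stays at the tone frequency. The IF array is one sample shorter than the input because of the finite difference.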
However, for $h(t) \notin L^p$, the IF obtained using the above method is not necessarily physically meaningful. For example, $h(t) = \cos\omega t + C$, where $C$ and $\omega$ are constants, does not yield a constant frequency of $\omega$. To explore the applicability of the Hilbert transform, Huang et al. (Huang N.E., 1998) showed that the necessary conditions for defining a meaningful IF are that the functions are symmetric with respect to the local zero mean and have the same numbers of zero crossings and extrema. They therefore applied the empirical mode decomposition (EMD) to the original data $h(t)$ to decompose it into intrinsic mode functions (IMFs) and a residual. Each IMF satisfies the following conditions: (1) in the whole data set, the number of extrema and the number of zero crossings must either be equal or differ by at most one; and (2) at any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero. The EMD is a series of high-pass