AudioHarmonicityType: The AudioHarmonicity D describes the degree of harmonicity of an audio signal.
AudioSignatureType: A structure containing a condensed representation of an audio signal that serves as a unique content identifier for robust automatic identification (it contains a statistical summarization of AudioSpectrumFlatnessType data).
AudioWaveformType: Description of the waveform
of the audio signal.
AudioFundamentalFrequencyType: The AudioFundamentalFrequency D describes the fundamental frequency of the audio signal.
DcOffsetType: The DcOffset describes, for each channel, the DC offset (the mean deviation of the samples from zero) of the AudioSegment.
InstrumentTimbreType: The InstrumentTimbreType is a set of timbre descriptors established in order to describe timbre perception among sounds belonging simultaneously to the Harmonic and Percussive sound families.
HarmonicSpectralCentroidType: The HarmonicSpectralCentroid is computed as the average, over the sound segment duration, of the instantaneous HarmonicSpectralCentroid within a running window. The instantaneous HarmonicSpectralCentroid is computed as the amplitude-weighted (linear scale) mean of the harmonic peaks of the spectrum.
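In symbols, writing A(h) and f(h) for the amplitude and frequency of the h-th of the N_H harmonic peaks in the current window (our notation, following the usual MPEG-7 formulation), the instantaneous value is

IHSC = \frac{\sum_{h=1}^{N_H} f(h)\, A(h)}{\sum_{h=1}^{N_H} A(h)}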
HarmonicSpectralDeviationType: The HarmonicSpectralDeviation is computed as the average, over the sound segment duration, of the instantaneous HarmonicSpectralDeviation within a running window. The instantaneous HarmonicSpectralDeviation is computed as the spectral deviation of the log-amplitude components from a global spectral envelope.
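One common formulation (our notation; the exact envelope estimate and normalization in the standard may differ) takes the local spectral envelope as the mean of adjacent harmonic amplitudes and measures the mean absolute deviation of the log-amplitudes from it:

SE(h) = \frac{A(h-1) + A(h) + A(h+1)}{3}, \qquad IHSD = \frac{1}{N_H} \sum_{h=1}^{N_H} \bigl|\log_{10} A(h) - \log_{10} SE(h)\bigr|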
HarmonicSpectralSpreadType: The HarmonicSpectralSpread is computed as the average, over the sound segment duration, of the instantaneous HarmonicSpectralSpread within a running window. The instantaneous HarmonicSpectralSpread is computed as the amplitude-weighted standard deviation of the harmonic peaks of the spectrum, normalized by the instantaneous HarmonicSpectralCentroid.
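In the same notation, and assuming the squared-amplitude weighting used by the MPEG-7 reference implementation, the instantaneous value is

IHSS = \frac{1}{IHSC} \sqrt{\frac{\sum_{h=1}^{N_H} A^2(h)\,\bigl(f(h) - IHSC\bigr)^2}{\sum_{h=1}^{N_H} A^2(h)}}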
HarmonicSpectralVariationType: The HarmonicSpectralVariation is defined as the mean, over the sound segment duration, of the instantaneous HarmonicSpectralVariation. The instantaneous HarmonicSpectralVariation is defined as the normalized correlation between the amplitudes of the harmonic peaks of two adjacent frames.
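Writing A(t,h) for the amplitude of the h-th harmonic peak in frame t, the normalized correlation is the cosine between the amplitude vectors of the two frames; the MPEG-7 reference formulation reports one minus this value, so that similar adjacent frames yield a variation near zero:

IHSV = 1 - \frac{\sum_{h} A(t-1,h)\, A(t,h)}{\sqrt{\sum_{h} A^2(t-1,h)}\,\sqrt{\sum_{h} A^2(t,h)}}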
LogAttackTimeType: The LogAttackTime is defined as the logarithm (base 10) of the duration between the time the signal starts and the time it reaches its stable part.
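That is, with T_0 the time the signal starts and T_1 the time it reaches its stable part,

LAT = \log_{10}(T_1 - T_0)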
SpectralCentroidType: The SpectralCentroid is computed as the power-weighted average of the frequencies of the bins in the power spectrum.
TemporalCentroidType: The TemporalCentroid is defined as the mean time of the signal weighted by its energy envelope.
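The last two descriptors reduce to weighted means, which the following minimal sketch illustrates; it assumes numpy, and the FFT size, frame length, and RMS envelope estimator are our own choices rather than values taken from the paper or the standard.

import numpy as np

def spectral_centroid(signal, sr, n_fft=1024):
    # SpectralCentroidType: power-weighted mean frequency of the power spectrum.
    power = np.abs(np.fft.rfft(signal, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    return np.sum(freqs * power) / np.sum(power)

def temporal_centroid(signal, sr, frame=441):
    # TemporalCentroidType: energy-weighted mean time of the signal envelope.
    n_frames = len(signal) // frame
    env = np.array([np.sqrt(np.mean(signal[i * frame:(i + 1) * frame] ** 2))
                    for i in range(n_frames)])          # RMS energy envelope
    times = (np.arange(n_frames) + 0.5) * frame / sr    # frame centres in seconds
    return np.sum(times * env) / np.sum(env)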
In our method, we selected, from among these 18 low-level descriptors, those that express each emotion well. We obtained retrieval results for each emotion using subsets of one to five descriptors and, in order to reduce computation time, fixed for each emotion the number of descriptors that gave the best retrieval result.
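A hypothetical sketch of this selection step follows; retrieval_score, the descriptor names, and the exhaustive search over subsets are our own illustration and are not specified in the paper.

from itertools import combinations

def best_descriptor_subset(descriptors, emotion, retrieval_score, max_size=5):
    # Try every subset of 1 to 5 descriptors and keep the one whose
    # retrieval result (e.g. precision on labelled examples) is best.
    best_subset, best_score = None, float("-inf")
    for size in range(1, max_size + 1):
        for subset in combinations(descriptors, size):
            score = retrieval_score(subset, emotion)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset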
2.3 Feedback Using the Consistency Principle and the Multi-query Method
Most EBMR/CBMR (Emotion-Based Music Retrieval / Content-Based Music Retrieval) systems use only physical features that can be expressed as vectors (Wold et al., 1996; Foote, 1997). Such representations make it easy to update weights or to construct query points. However, they make consistent retrieval results difficult to achieve, because a complex emotion must be expressed through a feature vector extracted from a single, simple query piece.
In this paper, we propose the principle of consistency for relevance feedback. The user's emotion query is formed by constructing multiple queries in every feedback step.
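As a rough illustration of multi-query retrieval (our own sketch; the paper's distance measure and ranking rule are not specified here), the query can be kept as a set of feature vectors and each candidate ranked by its distance to the nearest query point:

import numpy as np

def multi_query_distance(candidate, query_set):
    # Distance of one candidate feature vector to the closest query vector.
    return min(np.linalg.norm(candidate - q) for q in query_set)

def retrieve(database, query_set, top_k=10):
    # Rank all candidates by their nearest-query distance and return the top_k.
    return sorted(database, key=lambda x: multi_query_distance(x, query_set))[:top_k]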
Figure 3: Principle of consistency.
Fig. 3 shows the principle of consistency. If the music discriminated as suitable by the user and can-