is enhanced only if a good trade-off is made between frame pruning and speaker modeling. On the one hand, pruning irrelevant frames loses some speaker information, but it makes it easier to fit the speaker models to the data. On the other hand, using all frames preserves the entire speaker information, but it makes the model estimation less accurate and more complex. This work and our review of the literature lead us to conclude that, for TISAR systems, efficient frame pruning combined with suitable modeling can significantly speed up the recognition task without unduly compromising accuracy, and may even improve it. From this perspective, frame pruning is an important approach for designing real-time TISAR systems.
Most related work attempts to remove specific kinds of irrelevant frames using dedicated criteria based on silence, noise, phonetic information, or the correlation between successive frames. The main contribution of this work is to apply the UBM to prune all irrelevant frames at once, whatever their kind.
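As a concrete illustration, the sketch below shows one plausible way a UBM could drive such pruning: frames whose UBM log-likelihood falls below a percentile cut are discarded in a single pass, regardless of whether they correspond to silence, noise, or other outliers. The helper names, the use of scikit-learn's GaussianMixture, and the percentile-based threshold are illustrative assumptions, not the exact selection criterion used in this work.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_ubm(background_frames, n_components=64, seed=0):
    """Fit a diagonal-covariance GMM as the UBM on pooled background frames.

    background_frames: array of shape (n_frames, n_features), e.g. MFCCs.
    """
    ubm = GaussianMixture(n_components=n_components,
                          covariance_type="diag",
                          max_iter=100,
                          random_state=seed)
    ubm.fit(background_frames)
    return ubm


def prune_frames(ubm, frames, keep_percent=70.0):
    """Keep only frames whose UBM log-likelihood exceeds a percentile cut.

    Frames poorly explained by the UBM (silence, noise, outliers) tend to
    receive low log-likelihoods, so a single threshold removes them in one
    pass, whatever their kind. keep_percent is an illustrative parameter.
    """
    loglik = ubm.score_samples(frames)                    # per-frame log-likelihood
    threshold = np.percentile(loglik, 100.0 - keep_percent)
    return frames[loglik >= threshold]
```

The retained frames would then be passed on to speaker-model training or scoring; the percentage of frames kept controls the trade-off between pruning and preserved speaker information discussed above.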
To further this research, we plan to use this finding within a verification TISAR system. Moreover, efficient frame pruning techniques could serve as a basis for proposing new features (e.g., supervectors) or new models that give more importance to the vectors maximizing the selection criteria.