more robustness and better true positive rates.
For the identification of telephone spam, a signif-
icant rate of false negatives can be accepted since the
audio data will be replayed a number of times. But
false positive identifications of telephone spam should
be avoided, even for large hash repositories.
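The trade-off described above can be illustrated with a small calculation. The sketch below assumes that each replay of a spam call is missed independently with probability p_fn, and that each repository entry produces a false match independently with probability p_fp. Both independence assumptions, the function names, and the numbers are hypothetical and are not measurements from this study.

```python
def miss_all_replays(p_fn: float, replays: int) -> float:
    """Probability that every one of `replays` independent replays is missed."""
    return p_fn ** replays


def any_false_positive(p_fp: float, repository_size: int) -> float:
    """Probability that a legitimate call falsely matches at least one stored hash."""
    return 1.0 - (1.0 - p_fp) ** repository_size


if __name__ == "__main__":
    # Hypothetical rates: 30% false negatives per replay, 5 replays.
    print(miss_all_replays(0.3, 5))
    # Hypothetical per-entry false positive rate against one million stored hashes.
    print(any_false_positive(1e-9, 1_000_000))
```

Under these assumptions, a high per-call false negative rate shrinks geometrically with the number of replays, whereas the false positive probability grows with the repository size, which is why the per-entry false positive rate has to be very small.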
5 CONCLUSIONS
We studied the security and privacy requirements
of audio fingerprints and analyzed the existing ap-
proaches and algorithms. There exist various pow-
erful fingerprinting frameworks which permit an effi-
cient identification of audio samples. Some work has
been done on the security of audio hashes, but open
issues remain if the hash is used for multimedia
authentication and watermarking. This contribution
analyzes the privacy issues which are relevant for speech
data, for example when identifying replayed telephone data
(spam calls). The fingerprint should not leak
information on the original audio data.
By modifying well-known audio fingerprinting
algorithms and combining them with a cryptographic
message authentication code, we defined a
randomized audio hash which consists of a set of binary
vectors. We estimated the entropy of the subhash values,
which is important for the security properties of the
proposed method. Furthermore, we analyzed the
performance in terms of robustness and discrimination
power. We showed that the hash has adequate robust-
ness, at least if the audio samples have sufficient audio
quality, and excellent discrimination capabilities. The
hash permits an efficient identification of speech sig-
nals in large databases and prevents the exposure of
audio content.
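The general idea summarized above can be sketched in a few lines. This is a minimal illustration and not the scheme defined in this paper: the function names, the 32-bit sub-fingerprint width, the truncation of tags to 8 bytes, the matching threshold, and the plug-in entropy estimate are all assumptions made for the example. Only the principle is taken from the text: a message authentication code is applied under a secret key to each robust binary sub-fingerprint, and the resulting sets are compared.

```python
import hashlib
import hmac
import math
from collections import Counter
from typing import Iterable, Sequence, Set


def keyed_subhashes(subfingerprints: Iterable[int], key: bytes,
                    bits: int = 32, tag_len: int = 8) -> Set[bytes]:
    """Map each binary sub-fingerprint to a truncated HMAC-SHA256 tag."""
    n_bytes = (bits + 7) // 8
    tags = set()
    for fp in subfingerprints:
        # to_bytes raises OverflowError if fp does not fit into n_bytes.
        msg = fp.to_bytes(n_bytes, "big")
        tags.add(hmac.new(key, msg, hashlib.sha256).digest()[:tag_len])
    return tags


def matches(query: Set[bytes], stored: Set[bytes], min_common: int = 2) -> bool:
    """Declare a match if enough keyed subhash values coincide (assumed rule)."""
    return len(query & stored) >= min_common


def empirical_entropy_bits(values: Sequence[int]) -> float:
    """Plug-in estimate of the Shannon entropy (in bits) of observed values."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    key = b"hypothetical-secret-key"
    a = keyed_subhashes([1, 2, 3], key)
    b = keyed_subhashes([2, 3, 4], key)
    print(matches(a, b))                         # two sub-fingerprints in common
    print(empirical_entropy_bits([0, 1, 2, 3]))  # uniform over four values
```

Without the key, the stored tags cannot be recomputed from candidate sub-fingerprints. The entropy of the sub-fingerprints matters because low-entropy inputs could be enumerated by anyone who does obtain the key.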
Future work will incorporate additional audio ma-
terial and extend the study of the security properties
of robust keyed hash functions.
SECRYPT 2013 - International Conference on Security and Cryptography