Paper

Authors: Arafat Abu Mallouh, Zakariya Qawaqneh and Buket D. Barkana

Affiliation: University of Bridgeport, United States

ISBN: 978-989-758-212-7

Keyword(s): Deep Neural Network, GMM-UBM, I-Vector, Speaker Age and Gender Classification, Fine-Tuning.

Related Ontology Subjects/Areas/Topics: Acoustic Signal Processing ; Artificial Intelligence ; Biomedical Engineering ; Biomedical Signal Processing ; Computational Intelligence ; Data Manipulation ; Health Engineering and Technology Applications ; Human-Computer Interaction ; Methodologies and Methods ; Multimedia ; Multimedia Signal Processing ; Neural Networks ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Sensor Networks ; Signal Processing ; Soft Computing ; Speech Recognition ; Telecommunications ; Theory and Methods

Abstract: Speaker age and gender classification is one of the most challenging problems in speech processing. Neural networks have advanced remarkably in recent years, and the deep neural network (DNN) is now considered a state-of-the-art classifier that has been successful in many speech applications. Motivated by this success, we jointly fine-tune two different DNNs to classify a speaker’s age and gender. The first DNN is trained to classify the speaker’s gender, while the second is trained to classify the speaker’s age. The two pre-trained DNNs are then reused to tune a third DNN (AGender-Tuning) that classifies age and gender together. The results show improved accuracy for the proposed system over the I-Vector and GMM-UBM baseline systems. The performance of the proposed system is also compared with other published work on a publicly available database.
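
The abstract does not specify how the two pre-trained DNNs are combined, so the following is only a minimal sketch of one possible reading: the hidden layers of a gender network and an age network are copied, their activations are concatenated, and a new joint output layer is trained while all layers are fine-tuned together. The layer sizes, class counts, tanh activations, synthetic data, and every function name here are assumptions made for illustration, not details from the paper, and the random weights merely stand in for real pre-training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: none of these numbers come from the paper.
N_FEATS, N_HIDDEN = 20, 16
N_GENDER, N_AGE = 2, 3
N_JOINT = N_GENDER * N_AGE  # one class per (gender, age) pair

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return {"W": rng.normal(0.0, 0.1, (n_in, n_out)), "b": np.zeros(n_out)}

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Stand-ins for the two pre-trained networks (one hidden layer each).
gender_net = {"hidden": init_layer(N_FEATS, N_HIDDEN),
              "out": init_layer(N_HIDDEN, N_GENDER)}
age_net = {"hidden": init_layer(N_FEATS, N_HIDDEN),
           "out": init_layer(N_HIDDEN, N_AGE)}

def build_joint(gender_net, age_net):
    """Copy both hidden layers and attach a fresh joint output layer."""
    return {
        "g_hidden": {k: v.copy() for k, v in gender_net["hidden"].items()},
        "a_hidden": {k: v.copy() for k, v in age_net["hidden"].items()},
        "out": init_layer(2 * N_HIDDEN, N_JOINT),
    }

def forward(net, x):
    """Return concatenated hidden activations and joint class probabilities."""
    hg = np.tanh(x @ net["g_hidden"]["W"] + net["g_hidden"]["b"])
    ha = np.tanh(x @ net["a_hidden"]["W"] + net["a_hidden"]["b"])
    h = np.concatenate([hg, ha], axis=1)
    return h, softmax(h @ net["out"]["W"] + net["out"]["b"])

def fine_tune_step(net, x, y, lr=0.1):
    """One full-batch gradient step on cross-entropy, updating all layers."""
    n = x.shape[0]
    h, p = forward(net, x)
    loss = -np.log(p[np.arange(n), y]).mean()
    d_logits = p.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n
    d_h = d_logits @ net["out"]["W"].T  # uses the output weights before the update
    d_pre = d_h * (1.0 - h ** 2)        # derivative of tanh
    net["out"]["W"] -= lr * (h.T @ d_logits)
    net["out"]["b"] -= lr * d_logits.sum(axis=0)
    for name, cols in (("g_hidden", slice(0, N_HIDDEN)),
                       ("a_hidden", slice(N_HIDDEN, None))):
        net[name]["W"] -= lr * (x.T @ d_pre[:, cols])
        net[name]["b"] -= lr * d_pre[:, cols].sum(axis=0)
    return loss

# Synthetic features and joint labels, purely for illustration.
x = rng.normal(size=(64, N_FEATS))
y = rng.integers(0, N_JOINT, size=64)

joint = build_joint(gender_net, age_net)
losses = [fine_tune_step(joint, x, y) for _ in range(200)]
```

Copying the hidden layers rather than sharing them leaves the two original networks unchanged, so they could still be evaluated separately after the joint network is tuned.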

CC BY-NC-ND 4.0

Paper citation in several formats:
Abu Mallouh, A.; Qawaqneh, Z. and Barkana, B. (2017). Combining Two Different DNN Architectures for Classifying Speaker’s Age and Gender. In Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 4: BIOSIGNALS, (BIOSTEC 2017), ISBN 978-989-758-212-7, pages 112-117. DOI: 10.5220/0006096501120117

@conference{biosignals17,
author={Arafat Abu Mallouh and Zakariya Qawaqneh and Buket D. Barkana},
title={Combining Two Different DNN Architectures for Classifying Speaker’s Age and Gender},
booktitle={Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 4: BIOSIGNALS, (BIOSTEC 2017)},
year={2017},
pages={112--117},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006096501120117},
isbn={978-989-758-212-7},
}

TY  - CONF
JO  - Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 4: BIOSIGNALS, (BIOSTEC 2017)
TI  - Combining Two Different DNN Architectures for Classifying Speaker’s Age and Gender
SN  - 978-989-758-212-7
AU  - Abu Mallouh, A.
AU  - Qawaqneh, Z.
AU  - Barkana, B.
PY  - 2017
SP  - 112
EP  - 117
DO  - 10.5220/0006096501120117
ER  - 
