Objective Gender and Age Recognition from Speech Sentences

Fatima K. Faek

doi:10.14500/aro.10072

Authors

Fatima K. Faek, Electrical Engineering Department, College of Engineering, Salahaddin University, Kurdistan Region.

Keywords:

Age classification from speech, gender classification from speech, MFCC-based gender and age recognition, SVM classifier.

Abstract

This work investigates an automatic recognizer of gender and age from speech. The features relevant to gender recognition are selected from the first four formant frequencies and twelve Mel-frequency cepstral coefficients (MFCCs) and fed to a support vector machine (SVM) classifier, while the features relevant to age are used with a k-nearest neighbor (k-NN) classifier for the age recognizer model. MATLAB is used as the simulation tool. To improve the results of the gender and age classifiers, robust features are selected according to the frequency range that each feature represents. The gender and age classification algorithms are evaluated using 114 (clean and noisy) speech samples uttered in the Kurdish language. The two-class gender recognition model (adult males and adult females) reached 96% recognition accuracy, and the three-category model (adult males, adult females, and children) achieved 94% recognition accuracy. For the age recognition model, speakers are categorized into seven groups according to their ages. After selecting the features relevant to age, the model performance reached 75.3%. For further improvement, a de-noising technique is applied to the noisy speech signals, followed by selection of the proper features that are affected by the de-noising process, which results in 81.44% recognition accuracy.
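As a rough illustration of the k-NN step and of how a recognition accuracy figure is computed, the following is a minimal sketch in Python rather than the MATLAB used in the paper. The abstract does not state the value of k, the distance metric, or the feature values, so the Euclidean distance, `k=3`, the two hypothetical group labels, and the made-up 2-D feature vectors are all assumptions for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance (an assumed metric) from the query to each training vector.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    dists.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest vectors and return the most frequent one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y
        for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Made-up 2-D vectors standing in for selected MFCC/formant features
# (hypothetical values, not data from the paper).
train_x = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["group_a"] * 3 + ["group_b"] * 3
test_x = [(1.1, 0.9), (3.0, 3.0)]
test_y = ["group_a", "group_b"]

print(accuracy(train_x, train_y, test_x, test_y, k=3))
```

With real data, each vector would hold the selected features of one utterance and the labels would be the seven age groups; the accuracy function then gives the kind of percentage reported in the abstract.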

Author Biography

Fatima K. Faek, Electrical Engineering Department, College of Engineering, Salahaddin University, Kurdistan Region.

Fatima K. Faek is a lecturer in the Department of Electrical Engineering, Salahaddin University. She received the B.Sc. degree in Electrical Engineering from Salahaddin University in 1993 and the M.Sc. degree in Signal Processing (speech processing) from Salahaddin University in 2006. She started her academic teaching in 2006 as an Assistant Lecturer in the Department of Electrical Engineering. Miss Faek is a consultant engineer at the Kurdistan Engineering Union. Her research interests are speech, image, and video processing. She has six published journal papers.
