Kurdish Dialects and Neighbor Languages Automatic Recognition
Abstract
Dialect recognition is one of the most active topics in speech analysis. This study develops a system for dialect and language recognition that uses phonetic and style-based features. It proposes a new feature set based on the one-dimensional local binary pattern (LBP). The results show that the proposed LBP feature set improves dialect and language recognition accuracy. The data set covers three Kurdish dialects (Sorani, Badini, and Hawrami) and three neighbor languages (Arabic, Persian, and Turkish). The study also proposes a new method for interpreting the closeness of the Kurdish dialects and their neighbor languages, using the confusion matrix and a non-metric multidimensional visualization technique. The results show that the Kurdish dialects can be clustered and linearly separated from the neighbor languages.
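The abstract does not give the parameters of the one-dimensional LBP feature, so the following is only a minimal sketch of the commonly used 1-D LBP formulation, not the study's implementation. Each sample is compared with its neighbors on both sides, the comparisons are packed into a binary code, and the normalized histogram of codes serves as the feature vector. The function names `lbp_1d_codes` and `lbp_1d_histogram`, the `half_window` parameter, the `>=` threshold, and the bit ordering are all assumptions made for illustration.

```python
def lbp_1d_codes(signal, half_window=4):
    """Return one LBP code per interior sample of a 1-D signal.

    Assumption: each sample is compared with `half_window` neighbours on
    each side; a neighbour >= the centre sample contributes a 1 bit.
    """
    signal = list(signal)
    codes = []
    for i in range(half_window, len(signal) - half_window):
        center = signal[i]
        # Neighbours before and after the centre, centre itself excluded.
        neighbours = signal[i - half_window:i] + signal[i + 1:i + half_window + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= center:
                code |= 1 << bit  # assumed ordering: first neighbour is the lowest bit
        codes.append(code)
    return codes


def lbp_1d_histogram(signal, half_window=4):
    """Normalised histogram of 1-D LBP codes, usable as a feature vector."""
    codes = lbp_1d_codes(signal, half_window)
    n_bins = 2 ** (2 * half_window)
    hist = [0.0] * n_bins
    if not codes:
        return hist  # signal too short to contain any interior sample
    for code in codes:
        hist[code] += 1.0
    return [count / len(codes) for count in hist]


# Toy usage on a short sequence; real input would be speech samples or frames.
toy_signal = [1, 3, 2, 5, 4]
toy_codes = lbp_1d_codes(toy_signal, half_window=1)
toy_hist = lbp_1d_histogram(toy_signal, half_window=1)
```

With `half_window=4` the histogram has 256 bins, which keeps the feature vector at a fixed length regardless of utterance duration.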
Copyright (c) 2017 Abdulbasit K. Al-Talabani, Zrar Kh. Abdul, Azad A. Ameen
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who choose to publish their work with Aro agree to the following terms:
- Authors retain the copyright to their work and grant the journal the right of first publication. The work is simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License [CC BY-NC-SA 4.0]. This license allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors have the freedom to enter into separate agreements for the non-exclusive distribution of the journal's published version of the work. This includes options such as posting it to an institutional repository or publishing it in a book, as long as proper acknowledgement is given to its initial publication in this journal.
- Authors are encouraged to share and post their work online, including in institutional repositories or on their personal websites, both prior to and during the submission process. This practice can lead to productive exchanges and increase the visibility and citation of the published work.
By agreeing to these terms, authors acknowledge the importance of open access and the benefits it brings to the scholarly community.