Kurdish Dialect Recognition using 1D CNN

Keywords: Convolutional neural network, Deep learning, Dialect recognition, Machine learning

Abstract

Dialect recognition is one of the most actively studied topics in speech analysis, and machine learning algorithms have been widely used to identify dialects. In this paper, a model based on three different 1D Convolutional Neural Network (CNN) structures is developed for Kurdish dialect recognition. The model is evaluated, and the CNN structures are compared with one another. The results show that the proposed model outperforms the state of the art. The model is evaluated on experimental data collected by the staff of the Department of Computer Science at the University of Halabja. The dataset covers three dialects, as the Kurdish language consists of three major dialects, namely Northern Kurdish (Badini variant), Central Kurdish (Sorani variant), and Hawrami. An advantage of the CNN model is that it does not require handcrafted features, because it needs no separately engineered feature set. According to the results, the 1D CNN method classifies Kurdish dialects with an average accuracy of 95.53%. This study also proposes a new method for interpreting the closeness of the Kurdish dialects using a confusion matrix and a non-metric multidimensional visualization technique. The outcome demonstrates that the given Kurdish dialects are straightforward to cluster and are linearly separable from the neighboring dialects.
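
The idea of reading dialect closeness from a confusion matrix can be illustrated with a minimal sketch. The abstract does not state how the confusion matrix is converted into distances, so the rule used here (one minus the averaged two-way confusion rate) is an assumption, not the authors' method. The function name `confusion_to_dissimilarity` and the counts are hypothetical and are not results from the paper.

```python
def confusion_to_dissimilarity(confusion):
    """Turn a square confusion matrix of counts into a symmetric dissimilarity matrix.

    Assumed rule (not taken from the paper): two classes that are confused
    with each other more often are treated as closer.
    """
    n = len(confusion)
    rates = []
    for row in confusion:
        total = sum(row)
        if total == 0:
            raise ValueError("each true class needs at least one sample")
        # Row-normalise: fraction of a true class assigned to each label.
        rates.append([count / total for count in row])

    dissim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Average both confusion directions so the result is symmetric.
                similarity = (rates[i][j] + rates[j][i]) / 2.0
                dissim[i][j] = 1.0 - similarity
    return dissim


# Hypothetical counts for illustration only; rows are true dialects,
# columns are predicted dialects.
labels = ["Badini", "Sorani", "Hawrami"]
confusion = [
    [90, 8, 2],
    [6, 92, 2],
    [1, 3, 96],
]
dissim = confusion_to_dissimilarity(confusion)
for label, row in zip(labels, dissim):
    print(label, [round(value, 3) for value in row])
```

A symmetric matrix of this kind is the usual input to a non-metric multidimensional scaling routine, which places the dialects in a plane so that the rank order of the dissimilarities is preserved as far as possible.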

Author Biographies

Karzan J. Ghafoor, Department of Computer Science, College of Science, University of Halabja, Halabja, Kurdistan Region - F.R. Iraq

Karzan J. Ghafoor is a Lecturer at the Department of Computer Science, College of Science, University of Halabja. He holds a B.Sc. degree in Computer System Engineering and an M.Sc. degree in Data Communication. His research interests are in Machine Learning, Data Communication, Network Analysis, and Databases. Mr. Karzan is a member of the Kurdistan Engineering Union and the Kurdistan Teacher Union.

Karwan M. Hama Rawf, Department of Computer Science, College of Science, University of Halabja, Halabja, Kurdistan Region - F.R. Iraq

Karwan M. Hama Rawf is an Assistant Lecturer at the Department of Computer Science, College of Science, University of Halabja. He holds a B.Sc. degree in Computer Science from the University of Sulaimani, KRG, Iraq, and an M.Sc. degree in Computer Science from Coventry University, United Kingdom. His research interests are in Machine Learning, Cyber Security, and Web Development. Karwan has been a member advisor of the GLP Program at Coventry University, UK, since 2011. He is also a member of KELTPN, a professional, non-governmental network supported by the British Council, a UK-registered cultural relations organisation.

Ayub O. Abdulrahman, Department of Computer Science, College of Science, University of Halabja, Halabja, Kurdistan Region - F.R. Iraq

Ayub O. Abdulrahman is an Assistant Lecturer at the Department of Computer Science, College of Science, University of Halabja. He holds a B.Sc. degree in Computer Engineering and an M.Sc. degree in Electronic and Computer-Based System Design. His research interests are in Machine Learning, Embedded Systems, and the Internet of Things (IoT). Mr. Ayub is a member of Kurdish Society.

Sarkhel H. Taher, Department of Computer Science, College of Science, University of Halabja, Halabja, Kurdistan Region - F.R. Iraq

Sarkhel H. Taher Karim is a Lecturer at the Department of Computer Science, College of Science, University of Halabja. He holds B.Sc. and M.Sc. degrees in Computer Science. His research interests are in Machine Learning, Data Science, Social Network Analysis, and Big Data.

Published

2021-10-15

How to Cite

Ghafoor, K. J., Hama Rawf, K. M., Abdulrahman, A. O. and Taher, S. H. (2021) “Kurdish Dialect Recognition using 1D CNN”, ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 9(2), pp. 10-14. doi: 10.14500/aro.10837.