Gender Prediction of Journalists from Writing Style

Peshawa J. Muhammad Ali, Nigar M. Shafiq Surameery, Abdul-Rahman Mawlood Yunis, Ladeh Sardar Abdulrahman


Web-based Kurdish media have grown tangibly in the last few years. Several factors have contributed to this rapid growth, including easy access to the internet, the low price of electronic gadgets, and the pervasive use of social networking. The swift development of Kurdish web-based media imposes new challenges that need to be addressed. For example, a newspaper article published online has properties such as the author’s name, gender, age, and nationality, among others. Determining one or more of these properties computationally when ambiguity arises is an important open research area. In this study, the journalist’s gender in web-based Kurdish media is determined using computational linguistics and text mining techniques. Seventy-five web-based Kurdish articles were used to train an artificial model designed to determine the gender of journalists in web-based Kurdish media. The articles were downloaded from four well-known web-based Kurdish newspapers. Sixty-one features were extracted from each article; these features are distinctive in discriminating between genders. A Multi-Layer Perceptron (MLP) artificial neural network was used as the classification technique, and the accuracy obtained was 76%.
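As a rough illustration of the feature-extraction step, the sketch below computes a handful of generic writing-style features for a single article. The abstract does not list the 61 features actually used, so the function name `extract_style_features` and the five features shown are assumptions chosen for illustration, not the authors' feature set. The tokenization is also deliberately simplistic.

```python
import re
import string


def extract_style_features(text: str) -> dict:
    """Compute a few illustrative writing-style features for one article.

    These five features are hypothetical examples; the paper's own
    61 features are not specified in the abstract.
    """
    # \w+ matches Unicode word characters, so it tokenizes text written
    # in either Latin or Arabic script.
    words = re.findall(r"\w+", text)
    # Treat ., !, ? and the Arabic question mark as sentence terminators.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]

    n_words = len(words)
    n_sentences = len(sentences)
    n_chars = len(text)
    # string.punctuation covers ASCII punctuation only.
    n_punct = sum(ch in string.punctuation for ch in text)

    return {
        "word_count": float(n_words),
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
        "type_token_ratio": len({w.lower() for w in words}) / n_words if n_words else 0.0,
        "punctuation_ratio": n_punct / n_chars if n_chars else 0.0,
    }


if __name__ == "__main__":
    sample = "The sky is blue. The sea is blue too!"
    for name, value in extract_style_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Applied to every article, a function of this kind yields one fixed-length numeric vector per text. Those vectors, paired with the known gender labels, are the sort of input an MLP classifier would be trained on.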


Gender identification, Kurdish media, Neural networks, Text mining





Copyright (c) 2016 Peshawa J. Muhammad Ali, Nigar M. Shafiq Surameery, Abdul-Rahman Mawlood Yunis, Ladeh Sardar Abdulrahman

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


ARO Journal is an OAJ supported by Koya University; it has no article submission/processing charges (APCs).
© 2013-2019, Koya University is a public University accredited by the Ministry of Higher Education and Scientific Research, KRG - F.R. Iraq.