High Security and Capacity of Image Steganography for Hiding Human Speech Based on Spatial and Cepstral Domains

  • Yazen A. Khaleel Department of Software Engineering, Faculty of Engineering, Koya University, Danielle Mitterrand Boulevard, Koya KOY45, Kurdistan Region http://orcid.org/0000-0001-5658-1164
Keywords: Image steganography, Mel-frequency cepstral coefficients, Speech reconstruction


This paper proposes a new technique for hiding a speech signal clip inside a digital color image to improve steganographic security and payload capacity. The technique operates in both the spatial and cepstral domains, adopting Mel-frequency cepstral coefficients (MFCCs) as highly efficient features of the speech signal. It improves image steganography in two ways. First, it increases the hiding capacity by embedding the MFCC features and pitch values extracted from the speech signal in the cover color image, rather than directly hiding all samples of the digitized speech signal. Second, it improves data security by hiding the secret data (the MFCC features) anywhere in the host image, rather than relying on direct least significant bit substitution in the cover image. On the recovery side, the proposed approach retrieves the hidden features and uses them to reconstruct the speech waveform: the MFCC extraction steps are inverted to recover an approximate vocal tract response, which is then combined with an excitation signal based on the recovered pitch. The results show a peak signal-to-noise ratio of 52.4 dB for the stego-image, which reflects very good quality, and a reduction of the embedded data to about 6%–25%. In addition, the reconstructed speech shows a correlation of about 94.24% with the original speech signal.
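The two quality figures quoted in the abstract, the peak signal-to-noise ratio of the stego-image and the correlation between original and reconstructed speech, can be illustrated with a minimal Python sketch. It uses the standard textbook definitions and rests on assumptions that are not taken from the paper: pixels are 8-bit values (peak of 255) flattened into one sequence across all color channels, and "correlation" means the Pearson coefficient. The function names `psnr` and `correlation` are hypothetical.

```python
import math


def psnr(cover, stego, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values.

    Assumption: 8-bit samples, all color channels flattened together.
    """
    if len(cover) != len(stego):
        raise ValueError("cover and stego must have the same number of samples")
    # Mean squared error between cover and stego pixel values.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(peak ** 2 / mse)


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length signals.

    Assumption: this is the similarity measure meant by "correlation".
    """
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data only; these are not values from the paper.
    cover = [100, 120, 140, 160]
    stego = [101, 120, 140, 160]  # one pixel changed by one gray level
    print(f"PSNR: {psnr(cover, stego):.2f} dB")

    speech = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    rebuilt = [0.1, 0.4, 0.9, 0.6, 0.0, -0.4, -1.0, -0.6]
    print(f"Correlation: {100 * correlation(speech, rebuilt):.2f}%")
```

A higher PSNR means the stego-image is closer to the cover image, and a correlation near 100% means the reconstructed waveform closely follows the original.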



Author Biography

Yazen A. Khaleel, Department of Software Engineering, Faculty of Engineering, Koya University, Danielle Mitterrand Boulevard, Koya KOY45, Kurdistan Region

Yazen A. Khaleel is an assistant professor and an academic staff member in the Department of Software Engineering, Faculty of Engineering, at Koya University. He received his B.Sc. degree in electrical and electronics engineering in 1995. He holds an M.Sc. degree in signal processing and a Ph.D. degree in electronics and digital signal processing. He is a member of the Iraqi Engineers Syndicate and the Kurdistan Engineering Union. Dr. Yazen has published 9 journal articles and 3 conference papers [email: yazen.adnan@koyauniversity.org].



How to Cite
Khaleel, Y. A. (2020) “High Security and Capacity of Image Steganography for Hiding Human Speech Based on Spatial and Cepstral Domains”, ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 8(1), pp. 95-106. doi: 10.14500/aro.10670.