A Systematic Survey on Large Language Models for Static Code Analysis

Authors

Hekar A. Mohammed Salih and Qusay I. Sarhan

DOI: https://doi.org/10.14500/aro.12082

Keywords:

Large language models, Software metrics, Software quality, Static code analysis

Abstract

Static code analysis helps improve software quality, security, and maintainability by detecting vulnerabilities, errors, and other programming issues in source code without executing it. Recent advances in Artificial Intelligence (AI), especially the development of Large Language Models (LLMs) such as ChatGPT, have created transformational opportunities in this domain, making it essential to explore this active field of research from multiple directions. This systematic survey focuses on the use of LLMs for static code analysis, detailing their applications, advantages, contexts, and limitations. In this study, research papers published on the topic in well-known literature databases were examined to answer several research questions regarding the state-of-the-art use of LLMs for static code analysis. In addition, research gaps and open challenges were identified and discussed. The results of this study demonstrate how LLMs can support static code analysis and overcome several of its constraints, opening the door for developers and researchers to employ LLMs for affordable and effective static code analysis that improves the software development process.

Author Biographies

Hekar A. Mohammed Salih, Department of Computer Science, College of Science, University of Duhok, Zakho Street 38 AJ, Duhok, Kurdistan Region - Iraq

Hekar A. Mohammed Salih is an M.Sc. student at the Department of Computer Science, College of Science, University of Duhok. He holds a B.Sc. degree in Computer Science and Information Technology. His research interests are in Software Engineering, LLMs, and AI/ML.

Qusay I. Sarhan, Department of Computer Science, College of Science, University of Duhok, Zakho Street 38 AJ, Duhok, Kurdistan Region - Iraq

Qusay I. Sarhan is an Assistant Professor at the Department of Computer Science, College of Science, University of Duhok. He holds a B.Sc. degree in Software Engineering, an M.Tech. degree in Software Engineering, and a Ph.D. degree in Software Engineering. His research interests are in Software Engineering, the Internet of Things, and AI/ML.

Published

2025-06-22

How to Cite

Mohammed Salih, H. A. and Sarhan, Q. I. (2025) “A Systematic Survey on Large Language Models for Static Code Analysis”, ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 13(1), pp. 251–265. doi: 10.14500/aro.12082.

Section

Review Articles
Received 2025-03-03
Accepted 2025-05-27
