Large Language Model-Based Malware Detection for the Windows Operating System

Authors

  • Charles Clark, Department of Computer Sciences, University of Wisconsin-Madison, 1210 W Dayton St, Madison, WI 53706, USA
  • Niusen Chen, Department of Computer Science & Engineering, University of Nevada, Reno, 1071 Evans Ave, Reno, NV 89512, USA

DOI:

https://doi.org/10.65879/3070-5789.2025.01.09

Keywords:

Malware detection, Windows, large language models

Abstract

Malware detection on Windows systems remains challenging because malicious programs evolve rapidly and grow increasingly complex. Traditional static, dynamic, and machine learning approaches struggle to adapt to new or obfuscated threats. In this work, we propose a large language model (LLM)-based framework for detecting malware at the application layer of the Windows operating system. By learning behavioral patterns from the system calls triggered during program execution, the framework allows the LLM to capture the semantic relationships between normal and malicious behaviors. We implement a prototype of the framework and evaluate its performance experimentally.

Published

2025-12-17

Section

Articles