As the variety of malware samples grows, malware classification has become increasingly important. Static, dynamic, and hybrid features can be extracted to assign malware samples to malware families. In this work, we classify malware samples into nine classes, using API call histograms as input. API call frequencies serve as a static representation of the behaviour of each malware class. The dataset comes from the International CyberSecurity Data Mining Competition (CDMC) 2022 and was provided by Paul Black. Our approach consists of exploratory data analysis, training several machine learning models (Support Vector Machine, Multi-layer Perceptron, CatBoost, Random Forest), and comparing their performance with a baseline model (Decision Tree). To evaluate the models on the unlabelled test data, we use Cohen’s Kappa score, which measures agreement between predicted and true labels beyond what chance alone would produce and so indicates how reliable each model is.
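As an illustration of the evaluation metric, the following is a minimal pure-Python sketch of Cohen's Kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohen_kappa` and the example labels are made up for illustration; they are not taken from this repository or from the CDMC 2022 data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa between two equal-length label sequences.

    Illustrative helper (hypothetical name), not the repository's code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)

    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of its marginal
    # frequencies in the true and predicted labels, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Both sequences contain a single identical label; Kappa is
        # undefined here. Returning 1.0 is a convention chosen for this sketch.
        return 1.0

    return (p_o - p_e) / (1.0 - p_e)


# Made-up labels for three hypothetical malware classes.
example_true = [1, 1, 2, 2, 3, 3]
example_pred = [1, 1, 2, 3, 3, 3]
example_kappa = cohen_kappa(example_true, example_pred)
print(f"kappa = {example_kappa:.2f}")  # prints "kappa = 0.75"
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance. In practice a library implementation such as scikit-learn's `cohen_kappa_score` would normally be used instead of a hand-written function.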
abhishtjoshi/Machine-Learning-Based-Malware-Classification-based-on-API-call-Histogram