
abhishtjoshi/Machine-Learning-Based-Malware-Classification-based-on-API-call-Histogram


Machine-Learning-Based-Malware-Classification-based-on-API-call-Histogram

As the variety of malware samples grows, malware classification has become increasingly important. Static, dynamic, and hybrid features can be extracted to assign malware samples to malware families. In this work, we classify malware samples into 9 classes, using API call histograms as input. Each histogram records how often each API call occurs in a sample, and these API call frequencies serve as a static representation of the behaviour of the malware classes. The dataset comes from the International CyberSecurity Data Mining Competition (CDMC) 2022 and was provided by Paul Black.

Our approach consists of exploratory data analysis, followed by training several machine learning models (Support Vector Machine, Multi-layer Perceptron, CatBoost, Random Forest) and comparing their performance against a baseline model (Decision Tree). To evaluate the models on the unlabelled test data, we use Cohen’s Kappa score, which measures the agreement between predicted and true class labels after correcting for the agreement expected by chance. We use this score as an indicator of how reliable each model is.
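To make the evaluation metric concrete, here is a minimal sketch of Cohen's Kappa in plain Python. It is not the code used in this repository. The function name `cohen_kappa` and the toy labels are assumptions made for illustration, and the return value for the degenerate case (expected agreement equal to 1) is a convention chosen for this sketch. Computing the score requires reference labels, so the example uses a small hand-made labelled set.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement between the two label lists and
    p_e is the agreement expected by chance from the class frequencies.
    Hypothetical helper written for illustration only.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)

    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: sum over classes of P(true = k) * P(pred = k).
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)

    # Degenerate case: every label in both lists is the same single class.
    # Returning 1.0 here is an assumption of this sketch.
    if p_e == 1.0:
        return 1.0

    return (p_o - p_e) / (1.0 - p_e)


# Toy example with three classes (made-up labels, not competition data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
kappa = cohen_kappa(y_true, y_pred)
print(f"kappa = {kappa:.2f}")
```

A score of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance. This is why Kappa is more informative than raw accuracy when the 9 classes are not equally frequent.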
