Machine-Learning-Based-Malware-Classification-based-on-API-call-Histogram

As the variety of malware samples grows, classifying them has become increasingly important. Static, dynamic, and hybrid features can be extracted to assign malware samples to malware families. In this work, we classify malware samples into 9 classes, using API call histograms as input. API call frequencies serve as a static representation of the behaviour of each malware class. The dataset comes from the International CyberSecurity Data Mining Competition (CDMC) 2022 and was provided by Paul Black. Our approach consists of exploratory data analysis, followed by training several machine learning models (Support Vector Machine, Multi-layer Perceptron, CatBoost, Random Forest) and comparing their performance against a baseline model (Decision Tree). To evaluate the models on the unlabelled test data, we use Cohen’s Kappa score, which measures the agreement between predicted and true labels after correcting for agreement expected by chance. A higher score therefore indicates more reliable predictions.
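To make the evaluation metric concrete, here is a minimal sketch of Cohen’s Kappa in plain Python, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohen_kappa` and the toy label lists are illustrative assumptions and are not taken from the project notebooks.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences (illustrative helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of its marginal
    # frequencies in the true and predicted labels, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Degenerate case (a single class in both lists); returning 1.0
        # is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy labels for three hypothetical classes (not real CDMC data).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 0, 1, 2, 2, 2, 1, 1]
kappa = cohen_kappa(y_true, y_pred)
print(f"kappa = {kappa:.3f}")  # prints "kappa = 0.619"
```

In this toy example, 6 of 8 predictions match (p_o = 0.75) while chance agreement is p_e = 22/64, giving a kappa of about 0.619. In practice, scikit-learn's `sklearn.metrics.cohen_kappa_score` computes the same quantity.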

Files in this repository:

CSI5388_Final_Report.pdf
Group 7 Project.pptx
Metric_scores.png
README.md
ReadMe.txt
all-phases.ipynb
baseline.ipynb
eda-analysis.ipynb

abhishtjoshi/Machine-Learning-Based-Malware-Classification-based-on-API-call-Histogram
