A Python malware analysis library, intended mostly for machine learning purposes.
You can download the PDF of my dissertation at:
https://scholar.dsu.edu/theses/326/
Recommended citation (this helps me see if my work is used in other places):
Jones, Keith, "Malgazer: An Automated Malware Classifier With Running Window Entropy and Machine Learning" (2019). Masters Theses & Doctoral Dissertations. 326.
https://scholar.dsu.edu/theses/326
The slides for the dissertation can be viewed at:
You can find logs from different training sessions in the training folder.
You can access all the training data I used at:
Please file any bugs or issues using GitHub issues.
The "master" branch is for users. The "develop" branch is the bleeding edge: it is where new features land, and it is often broken.
This source code supports my dissertation and is not production ready. Expect the code to change often as I add functionality, with frequent breaking changes.
To run the Docker portion of this project, you will need a trained classifier that exposes a "predict_sample" function, such as the library.ml.ML class. Serialize this classifier with dill (or run train_classifier.py and use the classifier it saves) and place it at samples/ml.dill.
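As a rough illustration of the serialization step above, the sketch below saves and reloads an object that has a "predict_sample" method. DummyClassifier, save_classifier, and load_classifier are hypothetical names used only for this example, and the single-argument "predict_sample" signature is an assumption; the real library.ml.ML class may differ.

```python
import os

try:
    import dill as serializer  # the serializer this README asks for
except ImportError:
    # Fallback only so this sketch still runs where dill is not installed.
    import pickle as serializer


class DummyClassifier:
    """Hypothetical stand-in for a trained classifier (not library.ml.ML)."""

    def __init__(self, label="Unknown"):
        self.label = label

    def predict_sample(self, sample):
        # Assumed signature: a real classifier would extract features
        # from `sample` and run a trained model here.
        return self.label


def save_classifier(classifier, path):
    """Serialize `classifier` to `path`, creating parent directories if needed."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "wb") as handle:
        serializer.dump(classifier, handle)


def load_classifier(path):
    """Load a classifier previously written by save_classifier."""
    with open(path, "rb") as handle:
        return serializer.load(handle)
```

With a trained classifier in hand, calling save_classifier(classifier, "samples/ml.dill") would write the file to the location the Docker stack expects.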
Next, copy ".env.template" to ".env" and fill in any information for your instance.
Next, you can stand up this project with the following command after you have installed Docker on your system:
docker-compose up
You can rebuild all of the docker images at any time with the following command:
docker-compose build --no-cache
This was developed using Docker on a Mac. Other operating systems have not been tested (yet).
You can start a local registry with:
docker run -d -p 5000:5000 --restart=always --name registry registry:2
Once the stack is up in Docker, you can access:
- the web portion of this project at https://localhost (information about the API is on the "API" page of the website)
- the API portion of this project at https://localhost/api
- Portainer at https://localhost/portainer
Logs can be found in docker/logs in a directory for each node in the docker stack.
To use this module outside Docker, you will need to install its requirements with the following command:
pip install -r requirements.txt
If you have trouble with the FFTW library on a Mac, install it via Homebrew and point the pyleargist build at its library directory:
# brew install fftw
...
# export LIBRARY_PATH=/usr/local/Cellar/fftw/3.3.8/lib/
If you are running Windows or macOS, please make sure the dependencies for python-magic are installed. More information can be found at https://github.com/ahupp/python-magic.
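One quick way to confirm that python-magic and its libmagic dependency are usable is to ask it to describe a few bytes. The helper below is a minimal sketch, not part of this project; describe_bytes is a hypothetical name. It returns None when the import fails, which is also what happens when libmagic itself cannot be found.

```python
def describe_bytes(data):
    """Return libmagic's description of `data`, or None if python-magic is unusable."""
    try:
        # Raises ImportError if the package or the libmagic library is missing.
        import magic
    except ImportError:
        return None
    if not hasattr(magic, "from_buffer"):
        # A different package that is also named "magic" is installed.
        return None
    return magic.from_buffer(data)
```

If describe_bytes(b"%PDF-1.4\n") returns None, the python-magic dependencies still need to be installed.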
This application is covered by the Creative Commons BY-SA 4.0 license.
- https://creativecommons.org/licenses/by-sa/4.0/
- https://creativecommons.org/licenses/by-sa/4.0/legalcode
- Magic File
- PE File Structures
- ELF File Structures
- Mach-O File Structures