
EMBArk internals

BenediktMKuehne edited this page Nov 3, 2021 · 2 revisions

Technical Documentation

TODO table of contents

Modules

Django apps

The embark folder in the repository root houses the Django application

uploader


This app is responsible for uploading firmwares, running emba commands, and saving results and metadata into the SQL database.

models.py

The uploader app has the following models:

  • FirmwareFile: Stores the path of a firmware file on disk.
  • Firmware: Represents one emba run. It links a FirmwareFile with the corresponding flags for that emba command; the path of the run's emba_logs is also inferred from this model. The custom fields CharFieldExpertMode and BooleanFieldExpertMode add an extra attribute for marking a field as expert mode.
  • Result: Stores the results after emba.sh aggregates them into a CSV file.
  • DeleteFirmware: Stores a firmware model selected for deletion.
  • ResourceTimestamp: Stores system details such as CPU and memory usage at scheduled time intervals.
views.py
CRUD
  • save_file: POST request to upload file.
    POST with file
    /home/upload/<int:refreshed>/save_file
     
    
  • delete_file: Deletes already uploaded firmware files
    POST /home/delete/
    
Auth
  • logout_view: Logs the user out
Rendering HTMLs for Frontend
  • login: Renders login page
    /
    
    Response: HTML('templates/uploader/login.html')
    
  • home: Renders home page
    GET /home/
    
    Response: HTML('templates/uploader/home.html')
    
  • service_dashboard: Renders the service dashboard on click of service dashboard in Main Menu.
    GET /home/serviceDashboard/
    
    Response: HTML('templates/uploader/embaServiceDashboard.html')
    
  • report_dashboard: Renders the report dashboard on click of reports option in Main Menu.
    GET /home/reportDashboard/
    
    Response: HTML('templates/uploader/reportDashboard.html')
    
  • individual_report_dashboard: Renders an individual report; reached via the Details button of any firmware in the Reports section.
    GET /home/individualReportDashboard/<int:analyze_id>
    
    Response: HTML('templates/uploader/individualReportDashboard.html')
    
  • main_dashboard: Renders the main dashboard when user logs in
    GET /home/mainDashboard/
    
    Response: HTML('templates/uploader/mainDashboard.html')
    
Functional
  • download_zipped: Downloads the zipped logs for a given emba run of a Firmware.
    GET /download_zipped/<int:analyze_id>/
    
  • start_analysis: Starts analysis of a firmware.
    POST /home/upload/<int:refreshed>/
    
  • get_load: Returns CPU and memory usage.
    GET /get/load/
    
  • get_individual_report: Returns the Result model instance as a dict for a given emba run.
    GET /get_individual_report/<int:analyze_id>/
    
  • get_accumulated_reports: Returns an aggregated report of all firmwares analyzed to date.
    GET /get_accumulated_reports/
    
boundedExecutor.py
BoundedExecutor

The BoundedExecutor class is an extended wrapper around Python's ThreadPoolExecutor that adds a selectable, finite upper bound on the number of running processes. Internally it uses an additional BoundedSemaphore to track the current load: the semaphore is decremented on submit() and incremented when a process finishes.
On submit_firmware, the Archiver class is used to unpack the selected firmware into a new directory, and the emba.sh process is then started with a blocking subprocess.call.
The methods are documented in embark/uploader/boundedExecutor.py.

Unit tests are located in test_boundedExecutor.py.
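The bounding mechanism described above can be sketched as follows. This is a simplified stand-alone illustration of the pattern, not the actual code in embark/uploader/boundedExecutor.py; names and signatures are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor
from threading import BoundedSemaphore

class BoundedExecutor:
    """ThreadPoolExecutor wrapper with a finite upper bound on queued work."""

    def __init__(self, bound, max_workers):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)
        self.semaphore = BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        # Decrement the semaphore; refuse the task if the bound is reached.
        if not self.semaphore.acquire(blocking=False):
            return None
        try:
            future = self.executor.submit(fn, *args, **kwargs)
        except Exception:
            self.semaphore.release()
            raise
        # Increment the semaphore again once the task has finished.
        future.add_done_callback(lambda _: self.semaphore.release())
        return future

    def shutdown(self, wait=True):
        self.executor.shutdown(wait=wait)
```

A caller checks the return value of submit(): None signals that the bound was reached and the task was rejected.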

archiver.py
Archiver

The Archiver class is an extended wrapper around shutil that supports all common archive formats.
It supports packing, unpacking and validating different archive types.

Methods are documented properly in embark/uploader/archiver.py

Unit tests are located in test_archiver.py.
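The wrapping idea can be illustrated with shutil directly (a minimal sketch, not the actual embark/uploader/archiver.py; the method names are assumptions):

```python
import shutil
import tempfile
from pathlib import Path

class Archiver:
    """Thin wrapper around shutil's archive helpers (illustrative sketch)."""

    @staticmethod
    def supported_formats():
        # Names such as 'zip', 'tar', 'gztar', ... depending on the platform.
        return [name for name, *_ in shutil.get_unpack_formats()]

    @staticmethod
    def pack(src_dir, dst_base, fmt="zip"):
        # Archives the contents of src_dir; returns the created archive's path.
        return shutil.make_archive(dst_base, fmt, src_dir)

    @staticmethod
    def unpack(archive, dst_dir):
        # Format is inferred from the archive's file extension.
        shutil.unpack_archive(archive, dst_dir)
```

shutil infers the archive format from the file extension on unpack, which is what makes a single wrapper cover all common formats.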

forms.py

The current forms are generated from the models defined above. On instantiation, the forms additionally take care of HTML attributes such as expert mode and class.
See https://docs.djangoproject.com/en/3.2/topics/forms/modelforms/ for further information.

users


This app manages sign-in and sign-up functionalities.

models.py

The users app has the following models:

  • User: Model to store user login data.
views.py
Auth
  • signup: Registers a new user
    POST /signup
    Request: {
        "email": "",
        "password": "",
        "confirm_password": ""
      }
    
  • signin: Signs in a registered user
     POST /signin
     Request: {
        "email": "",
        "password": ""
      }
    

embark


This is the settings folder of the Django application. In this project, however, it also contains additional files.

consumers.py
WSConsumer

Extension of the WebsocketConsumer class from django-channels, acting as the consumer of websockets.

  • This class is responsible for establishing a websocket connection with the frontend.
  • It also accesses the Redis database through the CHANNEL_LAYER declared in settings.py in the same folder to send real-time events to the frontend.
  • These events are grouped by the id of the Firmware model that emba.sh is running for.
logreader.py
LogReader
  • Our current server implementation creates a temporary empty log file and then waits in a blocking loop for changes to emba.log via the inotify wrapper.
  • Whenever the log file changes, the difference between the emba.log file and the temporary log file is calculated with Python's difflib and passed to an RxPY method for further processing.
  • After extracting the relevant information from emba.log, a temporary message dictionary is updated and appended to a global dictionary that contains all messages for all running processes.
  • This message dictionary is pushed to the Redis database to be consumed by WSConsumer.
  • The inotify reader class uses a Python wrapper for the Linux system call inotify(7). In our implementation, it adds a watch on emba.log and sends events whenever the file changes. This way we can trigger the next steps for live-reading emba.log.
  • More details can be found in embark/embark/logreader.py
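The difflib step can be sketched in isolation (a simplified stand-alone illustration; the real implementation in embark/embark/logreader.py also wires in inotify and RxPY):

```python
import difflib

def new_log_lines(old_lines, new_lines):
    """Return only the lines added to the log since the last snapshot."""
    diff = difflib.unified_diff(old_lines, new_lines, n=0)
    # Keep only additions; skip the '+++' file header (also starts with '+').
    return [line[1:] for line in diff
            if line.startswith('+') and not line.startswith('+++')]
```

Each time inotify reports a change, the previous snapshot is compared with the current file content and only the newly appended lines are processed further.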
routing.py

Just like urls.py, this file contains the routes that map web-sockets to the corresponding extensions of the WebsocketConsumer class from django-channels.

runapscheduler.py
runapscheduler command
  • started as a command in entrypoint.sh via python3 manage.py runapscheduler --test &; this runs as a separate task
  • starts a logger that samples the system load at a predefined interval and saves it into the ResourceTimestamp model
  • also registers a cleanup task to prevent excessive database usage
  • relies on apscheduler for adding tasks
  • flags:
    • <>: sample every hour - delete after 2 weeks
    • --test: sample every second - delete after 5 minutes
  • For more details see the code in embark/uploader/management/commands/runapscheduler.py or the official Django documentation: https://docs.djangoproject.com/en/3.2/howto/custom-management-commands/
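The sample-and-cleanup behaviour can be sketched without apscheduler (a stand-alone illustration of the retention logic only; the real command registers apscheduler jobs and writes to the ResourceTimestamp model, and all names here are assumptions):

```python
from collections import deque
from datetime import datetime, timedelta

class ResourceSampler:
    """Keep load samples and drop those older than a retention window."""

    def __init__(self, retention=timedelta(weeks=2)):
        self.retention = retention
        self.samples = deque()  # entries: (timestamp, cpu_percent, mem_percent)

    def sample(self, cpu, mem, now=None):
        # In EMBArk this corresponds to saving a ResourceTimestamp row.
        self.samples.append((now or datetime.now(), cpu, mem))

    def cleanup(self, now=None):
        # Counterpart of the registered cleanup job: trim entries older
        # than the retention window to keep database usage bounded.
        cutoff = (now or datetime.now()) - self.retention
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()
```

With the --test flag the real command shortens both intervals (sample every second, delete after 5 minutes), which corresponds to passing a smaller retention here.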

Frontend

  • embaServiceDashboard.js: This client-side script takes the messages sent via websocket and shows live information about the emba processes running in the backend. Currently it displays the percentage, the current module, and the current phase the emba process is in. For each emba process it shows a container with this information, labeled by the firmware name.

    - `Socket.onmessage`: After the socket connection is established, this function binds incoming messages to the corresponding container and creates the container.
    - `makeProgress()`  : Update the Progress bar with the percentage of progress made in Analyzing the Firmware.
    - `livelog_phase()` : Append new phase messages to the container
    - `livelog_module()`: Append new module messages to the container
    - `cancelLog()`     : Removes the container from the UI.
    
  • alertBox.js: This script provides alert functionalities to display success and error alerts to user. These functions can be used across the project when there is something to notify for the user.

    - `errorAlert()`    : Displays an alert message when something fails.
    - `successAlert()`  : Displays a success message to the user.
    - `promptAlert()`   : Prompts for any input the user is required to enter.
    
  • accumulatedReport.js: This script generates reports from the data collected while analyzing the firmware. These reports are displayed in the main dashboard.

    - `getRandomColors()`        : Gets random colors for the charts.
    - `get_accumulated_reports()`: Gets the accumulated data of all analyzed firmware scans.
    - `promptAlert()`            : Prompts for any input the user is required to enter.
    
  • fileUpload.js: This script lets the user upload firmware files and makes POST calls to the backend to save the files locally.

    - `showFiles()`: Binds the file name to a div that is displayed to the user.
    - `postFiles()`: Makes an Ajax call and saves the files locally.
    
  • main.js: This script contains functionality related to the navigation menu and also the delete-firmware functionality.

    - `expertModeOn()` : Toggles the expert-mode options when analyzing the firmware.
    - `confirmDelete()`: Shows a confirmation window asking the user to confirm deletion of a firmware file.
    
  • mainDashboard.js: This script generates the system-load report and also validates the login.

    - `check_login()`  : Validates the login.
    - `get_load()`     : Gets the load over time: CPU and memory percentage.
    

Docker and docker-compose.

  • We use a single Dockerfile to take code from emba and embark folders and install all dependencies for our docker image.
  • We use docker-compose for orchestrating the entire architecture as a set of services.
    • auth-db: MySQL database.
    • emba: Container that houses the original emba repo behind the web layer provided by the embark Django application.
    • redis: For redis database.
  • We run everything in host mode, and for now no dependencies between containers are declared in the compose file, although the emba service does depend on the auth-db service.
  • TODO: emba service depends on auth-db service. Add explicit depends_on directive in docker-compose.yml.
  • To make development faster we additionally mount the entire embark folder as a volume in docker-compose.yml.
  • We open two ports for the emba service: one for HTTP/1 connections and the other for HTTP/2 (WebSockets).
  • entrypoint.sh: Shell script to run on container start.

uwsgi and asgi setup

uwsgi

Django comes with a built-in uWSGI setup to deploy the application behind the uWSGI web server for proper process and worker management.

asgi

For live data exchange between server and client, the Django framework provides websocket communication via ASGI, also called Django Channels. On the server side, the framework has consumers.py in place; on the client side, you just open a websocket connection to the IP and port specified in the backend. As a result, multiple clients can connect to the backend. The URL routing is declared in routing.py, which is the equivalent of urls.py for HTTP communication.

Running

We deploy the WSGI application using the uWSGI library for Python and the ASGI application using the Daphne server.

Database

MySQL DB

We use a MySQL database as the default database of our Django application. It is part of docker-compose.yml in the root of our repo. This database serves the following purposes:

  1. Stores all the models of the uploader and users apps.
  2. As a consequence of point 1, stores the locations of, and the corresponding commands for, firmwares.
  3. Acts as the datastore for results after processing.

docker-compose takes care of all the setup for you, with some minimal configuration required in your .env file. The following variables are used to create and access the database in the MySQL container.

DATABASE_NAME=<Name you are going to give your db>
DATABASE_USER=root
DATABASE_PASSWORD=<value of MYSQL_ROOT_PASSWORD>
DATABASE_HOST=0.0.0.0 (or host.docker.internal on Windows)
DATABASE_PORT=3306
MYSQL_ROOT_PASSWORD=<This should be set>
MYSQL_DATABASE=<Same as DATABASE_NAME>
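On the Django side, these variables are typically wired into settings.py along these lines (an illustrative sketch of the usual pattern, not EMBArk's exact settings):

```python
# settings.py (sketch): connect Django to the MySQL container via .env values
import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': os.environ.get('DATABASE_NAME'),
        'USER': os.environ.get('DATABASE_USER'),
        'PASSWORD': os.environ.get('DATABASE_PASSWORD'),
        'HOST': os.environ.get('DATABASE_HOST'),
        'PORT': os.environ.get('DATABASE_PORT', '3306'),
    }
}
```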

Redis

We also use a Redis DB for caching intermediate results and events from the various emba.sh runs. These events are not persisted permanently; Redis is mostly used as a queue that stores them until they are pushed to the frontend through websockets.

docker-compose takes care of all the setup for you, with some minimal configuration required in your .env file. The following variables are required to access the Redis container.

REDIS_HOST=0.0.0.0 (or host.docker.internal for Docker Desktop)
REDIS_PORT=6379
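Assuming the channel layer uses the common channels_redis backend, the corresponding settings would look roughly like this (a sketch of the usual pattern; the exact CHANNEL_LAYER configuration lives in embark/embark/settings.py):

```python
# settings.py (sketch): point django-channels at the Redis container
import os

CHANNEL_LAYERS = {
    'default': {
        'BACKEND': 'channels_redis.core.RedisChannelLayer',
        'CONFIG': {
            'hosts': [(os.environ.get('REDIS_HOST', '127.0.0.1'),
                       int(os.environ.get('REDIS_PORT', 6379)))],
        },
    },
}
```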

Guidelines

testing

The project uses the Django testing environment. To write your own unit tests you need a Python file whose name starts with test_. Within it there must be classes extending TestCase. On test execution, all methods in the test classes are invoked and run.

Existing test cases:

  • test_archiver.py: For testing Archiver class in embark/uploader/archiver.py
  • test_boundedExecutor.py: For testing BoundedExecutor class in embark/uploader/boundedExecutor.py
  • test_users.py: Tests for embark/users/views.py
  • test_logreader.py: Tests for embark/embark/logreader.py

There is a pipeline that checks for regressions by running the Django test environment: python manage.py test.
You are encouraged to run the tests locally beforehand.

logging and debugging

For logging, use Django's logging environment.
The configuration can be found in embark/settings.py as LOGGING. Logs can be inspected at embark/logs/*.log
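A minimal LOGGING configuration along these lines might look like the following (a sketch of the standard Django dictConfig pattern; the actual configuration is in embark/settings.py):

```python
# settings.py (sketch): a 'web' logger writing to logs/web.log
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'web': {
            'class': 'logging.FileHandler',
            'filename': 'logs/web.log',
        },
    },
    'loggers': {
        'web': {
            'handlers': ['web'],
            'level': 'INFO',
        },
    },
}
```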

Usage (this will create the logfile 'web'):

logger = logging.getLogger('web')

[...]

logger.info("my very own log message")

For further reading, see the Django logging how-to.

codestyle

Every contributor is obliged to follow these coding-style rules for EMBArk code:

  • shellcheck: Shell scripts should pass a shellcheck test with NO findings.
  • pycodestyle: To check your Python code for PEP 8 conformity locally: pycodestyle .
    To get further information about a violation run: pycodestyle --show-source .
    For further settings see the pycodestyle documentation.
  • pylint: Additionally, we check fresh code with pylint. The code should pass pylint without warnings.
    pylint --max-line-length=240 --load-plugins pylint_django script.py
  • CodeQL: Finally, your code needs to pass GitHub's CodeQL scanning.

All of the checks run automatically on fresh pull requests. For offline testing you can use the check_project.sh script in the EMBArk root directory.