Ratin Kumar edited this page Aug 31, 2020 · 2 revisions

Introduction

This document details the approach taken and the changes introduced by database support in Ganga. Database support adds two features:

  1. Database Backend for Ganga CLI.
  2. Database Backend for future expansion of Ganga for Centralized Deployment.

Changes Introduced by Database Support:

Change in LifeCycle of a GangaProcess:

  1. A controller from container_controllers.py starts a database instance.
  2. GangaRepositoryDatabase._read_master_cache() loads the index information of all master jobs at once.
  3. When a job is requested, GangaRepositoryDatabase.load() is called to load the job's JSON from the database.
  4. On quitting, GangaRepositoryDatabase.index_write() saves the updated indexes for all jobs.
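The lifecycle above can be sketched roughly as follows. This is a minimal illustration, not the Ganga implementation: `SketchController`, `SketchRepository`, and the two dictionaries standing in for the database are all hypothetical, and only the order of operations (start, load indexes at once, load jobs on request, write indexes on quit) is taken from the steps above.

```python
class SketchController:
    """Hypothetical stand-in for a controller from container_controllers.py."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def quit(self):
        self.running = False


class SketchRepository:
    """Hypothetical stand-in for GangaRepositoryDatabase."""

    def __init__(self, job_table, index_table):
        self._job_table = job_table      # stands in for stored job JSON
        self._index_table = index_table  # stands in for stored index entries
        self.index = {}                  # in-memory index cache
        self.loaded = {}                 # jobs loaded so far

    def read_master_cache(self):
        # Step 2: load the index entries of all master jobs at once.
        self.index = {jid: dict(entry) for jid, entry in self._index_table.items()}

    def load(self, job_id):
        # Step 3: fetch the full job JSON only when the job is requested.
        if job_id not in self.loaded:
            self.loaded[job_id] = dict(self._job_table[job_id])
        return self.loaded[job_id]

    def index_write(self):
        # Step 4: save the (possibly updated) indexes for all jobs.
        for jid, entry in self.index.items():
            self._index_table[jid] = dict(entry)


jobs = {0: {"name": "foo", "status": "completed"},
        1: {"name": "bar", "status": "new"}}
indexes = {0: {"name": "foo", "status": "running"},
           1: {"name": "bar", "status": "new"}}

controller = SketchController()
controller.start()                        # step 1
repo = SketchRepository(jobs, indexes)
repo.read_master_cache()                  # step 2
job = repo.load(0)                        # step 3: only job 0 is loaded
repo.index[0]["status"] = job["status"]   # index updated during the session
repo.index_write()                        # step 4
controller.quit()
```

In this sketch the index cache is cheap to load in one pass, while the full job documents are fetched lazily, which is the split the steps above describe.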

DStreamer.py: Json Object Loader and Dumper functions:

Classes:

[JsonRepresentation](https://github.com/ganga-devs/ganga/wiki/Ganga-Database):

Class used by GangaObjects for the to_json() method. There are three implementations of to_json():

  1. Objects.py: This is the base implementation of to_json() for all GangaObjects. An object can have any of three kinds of attributes attached to it, and each is converted to JSON as follows:
    1. Simple attribute (e.g., j.name = "foo"): the conversion simply adds the key:value pair to the object dict: {"name": "foo"}.
    2. Component attribute (GangaObject-like):
      1. GangaList: Since a GangaList can contain further GangaLists, GangaList has a separate implementation of to_json(). When a GangaList is found, the case reduces to: {"attribute_name": attribute_value.to_json()}.
      2. GangaObject that is not a GangaList: Conversion of a GangaObject to JSON makes use of the JsonDumper's object_to_json.
  2. Job.py: Simply calls the to_json() method of each of its attributes.
  3. GangaList.py: Iterates through all the elements of the GangaList and returns its items. If any item is itself a GangaList, it recurses and gets all the elements from that item.
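The three cases above can be illustrated with a small, self-contained sketch. `SketchObject` and `SketchList` are hypothetical stand-ins for GangaObject and GangaList, and the `"type"` key used to record the object type is an assumption made for this example. For brevity, nested objects here call their own `to_json()` directly, whereas the text states that the real code routes non-list GangaObjects through the JsonDumper's object_to_json.

```python
class SketchObject:
    """Hypothetical stand-in for a GangaObject; attributes live in a dict."""

    def __init__(self, type_name, **attrs):
        self.type_name = type_name
        self.attrs = attrs

    def to_json(self):
        doc = {"type": self.type_name}  # "type" key is an assumption
        for name, value in self.attrs.items():
            if isinstance(value, (SketchObject, SketchList)):
                # Component attribute: delegate to the component's to_json().
                doc[name] = value.to_json()
            else:
                # Simple attribute: add the key:value pair directly.
                doc[name] = value
        return doc


class SketchList:
    """Hypothetical stand-in for a GangaList, which may nest further lists."""

    def __init__(self, items):
        self.items = list(items)

    def to_json(self):
        out = []
        for item in self.items:
            if isinstance(item, (SketchObject, SketchList)):
                # Nested list or object: recurse.
                out.append(item.to_json())
            else:
                out.append(item)
        return out


job = SketchObject(
    "Job",
    name="foo",
    application=SketchObject("Executable", exe="echo"),
    inputfiles=SketchList(["a.txt", SketchList(["b.txt"])]),
)
job_json = job.to_json()
```

The resulting `job_json` contains only plain dicts, lists, and scalars, so it can be handed to any JSON-capable store.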

[JsonLoader](https://github.com/ganga-devs/ganga/wiki/Ganga-Database):

Class used for loading GangaObjects from the JSON documents retrieved from the database. For each GangaObject type found in the JSON representation, an empty object of that type is created. All simple attributes are assigned to the empty object; for component attributes, empty objects are created recursively and assigned to their owner/master/parent.
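A rough sketch of this recursive loading is shown below. The `Empty` class, the `"type"` key that marks an object in the JSON, and the `parent` attribute name are all assumptions made for illustration; the real loader creates empty GangaObjects of the corresponding type.

```python
class Empty:
    """Hypothetical empty object created before attributes are assigned."""

    def __init__(self, type_name):
        self.type_name = type_name
        self.parent = None


def load_from_json(doc, parent=None):
    """Rebuild an object tree from plain JSON data (sketch only)."""
    if isinstance(doc, dict) and "type" in doc:
        # Create an empty object of the recorded type, then fill it in.
        obj = Empty(doc["type"])
        obj.parent = parent
        for name, value in doc.items():
            if name != "type":
                # Component attributes recurse; simple ones pass through.
                setattr(obj, name, load_from_json(value, parent=obj))
        return obj
    if isinstance(doc, list):
        return [load_from_json(item, parent=parent) for item in doc]
    return doc  # simple attribute value


loaded_job = load_from_json({
    "type": "Job",
    "name": "foo",
    "application": {"type": "Executable", "exe": "echo"},
    "inputfiles": ["a.txt", ["b.txt"]],
})
```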

Functions:

  1. object_to_database():
    1. Converts any GangaObject to its corresponding JSON by calling the internal to_json() method (for more information on the method check: TBA).
    2. If the JSON representation already exists in the database it is updated; otherwise it is inserted.
    3. Raises DatabaseError, when the insertion fails.
  2. object_from_database():
    1. Searches for the data in the database using the information from _filter param.
    2. If the object exists in the database, the JSON representation is converted into a GangaObject. If the object is not found, DatabaseError is raised.
  3. index_to_database():
    1. Saves cache/index information of jobs into the database.
    2. Can store single or multiple files at once.
    3. If insertion fails or returns empty set, DatabaseError is raised.
  4. index_from_database():
    1. Retrieves index information for the object identified by the _filter param.
    2. If the retrieval set is empty, raises DatabaseError.
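The update-or-insert and raise-on-miss behaviour described above can be sketched against an in-memory store. Everything here is hypothetical: `InMemoryCollection` stands in for the real database, `save_document` and `load_document` only mirror the shape of the functions listed above (their signatures are assumptions), and `DatabaseError` is defined locally so the example runs on its own.

```python
import copy


class DatabaseError(Exception):
    """Local stand-in for the DatabaseError named in the text."""


class InMemoryCollection:
    """Hypothetical document store holding a list of dicts."""

    def __init__(self):
        self.docs = []

    def find(self, _filter):
        # Return every stored document matching all key:value pairs.
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in _filter.items())]


def save_document(collection, doc, _filter):
    """Update the matching document if one exists, otherwise insert."""
    if not isinstance(doc, dict):
        raise DatabaseError("only dict documents can be stored")
    matches = collection.find(_filter)
    if matches:
        matches[0].clear()
        matches[0].update(copy.deepcopy(doc))
    else:
        collection.docs.append(copy.deepcopy(doc))


def load_document(collection, _filter):
    """Return a copy of the first match, or raise if nothing matches."""
    matches = collection.find(_filter)
    if not matches:
        raise DatabaseError("no document matches %r" % (_filter,))
    return copy.deepcopy(matches[0])


jobs_store = InMemoryCollection()
save_document(jobs_store, {"id": 0, "name": "foo"}, {"id": 0})  # insert
save_document(jobs_store, {"id": 0, "name": "bar"}, {"id": 0})  # update
stored = load_document(jobs_store, {"id": 0})
```

Saving twice with the same filter leaves a single document in the store, which is the property that lets callers save an object without first checking whether it exists.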

container_controllers.py: Container/Database Service Controllers:

This file consists of implementations for 4 database controllers, where each controller is responsible for starting/shutting down the database service (except in the case of a native db):

  1. Native: The database is installed locally. In this case the controller is a blank implementation, because in most cases Ganga would not have the rights to start or stop the database service running on the system.
  2. Docker
  3. uDocker
  4. Singularity

Each controller implements two actions:

  • start: Starts the container. Every controller implements this action in the following steps:

    1. Check whether a container matching the required specs (`name: getConfig('DatabaseConfigurations')['containerName']`) exists in the container service registry.

      1.1 If it does, start the container using the required command.

      1.2 If it does not, create the container using the required command.

  • quit: Stops the container. Every controller implements this action in the following steps:

    1. Check whether the container is still running.

      1.1 If it is running, shut down the container (and also delete it if required).

      1.2 If it is not running, skip.
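The start and quit decision logic can be sketched without any real container runtime. `FakeRegistry`, `SketchContainerController`, the `remove_on_quit` flag, and the container name are hypothetical; in Ganga the name would come from `getConfig('DatabaseConfigurations')['containerName']`, and each step would run the backend-specific command. The sketch also assumes that creating a container leaves it running.

```python
class FakeRegistry:
    """Hypothetical in-memory stand-in for a container service registry."""

    def __init__(self):
        self.state = {}  # container name -> "running" or "stopped"


class SketchContainerController:
    """Hypothetical controller showing only the start/quit branching."""

    def __init__(self, registry, container_name, remove_on_quit=False):
        self.registry = registry
        self.name = container_name
        self.remove_on_quit = remove_on_quit

    def start(self):
        # Step 1: does a container with the configured name exist?
        existed = self.name in self.registry.state
        # 1.1 start the existing container, or 1.2 create a new one
        # (assumed here to leave the container running).
        self.registry.state[self.name] = "running"
        return "started" if existed else "created"

    def quit(self):
        # Step 1: is the container still running?
        if self.registry.state.get(self.name) != "running":
            return "skipped"                    # 1.2 not running
        if self.remove_on_quit:
            del self.registry.state[self.name]  # 1.1 shut down and delete
            return "removed"
        self.registry.state[self.name] = "stopped"  # 1.1 shut down only
        return "stopped"


registry = FakeRegistry()
controller = SketchContainerController(registry, "ganga-db-sketch")
first_start = controller.start()
first_quit = controller.quit()
second_start = controller.start()
controller.quit()
repeat_quit = controller.quit()
```

Because quit checks the running state first, calling it twice is harmless, which matches the "skip" branch above.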