This is the code for the website of GiPHouse powered by Django.
Because every student needs to have an GitHub account to participate in the GiPHouse courses, authentication is based on GitHub OAuth. This removes the need to save separate usernames and passwords. Instead of usernames and passwords, the website uses GitHub IDs to authenticate users.
A custom user model (called Employee
) has been created to make this possible.
The website has single click login because it uses GitHub's OAuth implementation.
The GiPHouse GitHub organization has a GitHub App set up which is used for both repository synchronisation (explained later) and GitHub OAuth, which allows users to authorize the GiPHouse GitHub organization to see (limited) information about them. This App has credentials (a client ID and a client secret key), which are loaded into the Django settings using environment variables.
The authentication flow is as follows.
- A user requests to login or register.
- The user is redirected to
https://github.com/login/oauth/authorize
. - If the user has not authorized the GiPHouse GitHuB (OAuth) App, GitHub asks the user to do so.
- If the user has authorized the GiPHouse Github (OAuth) App, they are redirected to the GiPHouse website with a temporary code as GET parameter.
- The website uses this code to request an access token from the GitHub API.
- The website uses the access token to request information about the user.
To support this authentication flow, a custom authentication backend (called GithubOAuthBackend
) has been created.
Admin users (i.e. teacher, director or student assistant) do not have to register, they need to be created manually. In the backend it is possible to add new users. Every user should have their GitHub ID and GitHub username set. To promote a user to superuser, they should have the "Staff status", "Active" and "Superuser status" checkboxes checked.
The website makes the distinction between the spring semester (February - August) and the fall semester (September - January). Most of the content on the site is split by semester and year, because the content of the course that are given in the spring semester generally have nothing to do with the courses of the fall semester.
Every semester has a registration start and end date. Users will be able to between the start and end of the registration. They can do this by going to https://giphouse.nl/register/.
The registration process consists of two steps:
- Authorizing the GiPHouse GitHub (OAuth) App.
- A form that asks question about the registration (e.g. email, student number and project preferences).
The project preference is required to fill in during the registration, but registrations can be deleted later if necessary if for instance a project which people gave as preference is being discontinued.
If multiple semesters have open registrations at one moment, the chronologically newest semester is picked. For example, if the fall 2019 and spring 2020 semester both allow registrations at one moment, users will register for the spring 2020 semester (even if at the moment of registration it is officially fall 2019).
It is possible for users to register multiple times. For example, if a user wants to follow a course in the fall semester and a different course in the spring semester. However, users cannot register multiple times within the same semester.
There is no built-in support for directors (which will look like registrations that are not appointed to a project).
Admin users can generate a suggested team assignment. This suggested team assignment is generated by solving a linear integer optimization and constraint satisfaction problem using Google OR-Tools.
The constraint satisfaction problem is based on the nurse scheduling problem. It contains constraints that ensure that:
- Everyone is assigned to exactly one project.
- Managers and engineers are distributed equally over all projects.
- All projects contain at least one dutch speaking manager. This constraint is left out if there are fewer dutch speaking managers than there are projects.
In addition to these constraints it contains objectives that are tried to be met, but not guaranteed:
- Employees should be in one of their preferred projects. The first project preference is preferred over the second.
- Employees should be in the same projects as their partner preferences. The order of the partner preferences does not matter for this.
- Projects should contain employees with varied programming experience.
During the courses, the students need to fill out surveys about the course, their project progress and their team. Admin users are able to create questionnaires and view submission by students in the backend.
The questions in the questionnaires are of the following types:
- Poor/good likert scale,
- Disagree/agree likert scale and
- Open question.
It is also possible to ask a question multiple times about each teammate.
Questionnaires have a soft deadline and a hard deadline, which allows students to submit their answers late (i.e. after the soft deadline but before the hard deadline).
The room reservation is built using FullCalendar, a popular JavaScript Calendar. FullCalendar allows users to drag rooms that they want to reserve into timeslots. FullCalendar makes requests to the website to save changes in the database. The rooms are created in the backend by admin users.
Admin users can add information about the course lectures and the projects in the backend. There are also a small amount of static HTML webpages with information about GiPHouse.
The projects module provides synchronisation functionality with a GitHub organization using the GitHub API v3. For this, a repository model is included in Django. Project(team)s can have one or multiple repositories, which are then synchronised with GitHub. For this functionality, a GitHub App must be registered and installed in the organization. Details on this are explained later.
Projects and repositories contain a field github_team_id
and github_repo_id
that corresponds to the respective id
of the object on GitHub. These fields are automatically set and should not be touched under normal circumstances. Teams and repositories on GitHub that do not match one of these id's will not be touched by the GitHub synchronization.
If the github_team_id
or github_repo_id
are None
, it is assumed the objects do not exist and new objects will be created on synchronization (except for archived projects and teams).
Repositories and project(team)s are synchronized with GitHub in the following manner:
- For each project, a GitHub Team is created in the organization.
- Employees of a project are added to the team as "member"s.
- All users (employee or not) that do not belong in a team, are removed from both the team and the organization.
- Organization owners will never be removed from the organization, only from the team.
- Teams that are manually created in GitHub and not linked in Django, are ignored and will not be removed.
- Each repository is created on GitHub in the organization.
- Depending on the environment variables, either a private or public repository is created.
- The associated team is given "admin" access to the repository.
- Other additional permissions of a repository stay untouched.
- Repositories can be archived.
- Repositories that are marked as 'To be archived' will be archived on GitHub during the next sync. After they have been archived on GitHub, they are marked as 'Archived' and can only be unarchived manually via github.com.
- A project is considered archived if all of its repositories are archived.
- If a project is archived, the associated GitHub Team will be removed and consequently, all employees will be removed from the organization (again, organization owners are ignored).
Synchronization can only be initialized via actions on specific sets of objects in their changelists, or via the big 'synchronize to GitHub' button (to perform synchronization on all objects) in the admin. Synchronization is implemented in a idempotent manner.
Synchronization currently does not regard the role of directors of GipHouse. This needs to be configured manually. Note that it is however not possible to add directors manually to a team on GitHub, since they will be removed after each sync.
The projects module provides synchronisation functionality with AWS Organizations using the official boto3 Python AWS SDK. The AWS synchronisation process only applies to the current semester and is one-directional (from GiPHouse to AWS, but not vice versa).
Each project in the current semester with a team mailing list gets its own AWS member account that is part of GiPHouse's AWS organization. Since all AWS member accounts have isolated environments, each team is able to configure their own AWS environment as desired. The AWS member accounts are restricted in their abilities using a pre-configured SCP policy that is applied to the course semester Organizational Unit (OU) where all team member accounts reside. For example, the SCP policy can be set such that only (certain types of) EC2 instances may be launched. Such specific configuration details can be found under the Getting Started section.
The entire AWS synchronization process, also referred to as the pipeline, can be initiated in the Django admin interface under Projects by pressing the large SYNCHRONIZE PROJECTS OF THE CURRENT SEMESTER TO AWS
at the top-right and roughly goes through the following stages:
- Preliminary checks
- Pipeline preconditions
- Locatable boto3 credentials and successful AWS API connection
- Check allowed AWS API actions based on IAM policy of caller
- Existing organization for AWS API caller
- AWS API caller acts under same account ID as organization's management account ID
- SCP policy type feature enabled for organization
- Edge case checks
- No duplicate course semester OU names
- Pipeline preconditions
- Create current course semester OU (if non-existent)
- Attach SCP policy to current course semester OU (if non-existent)
- Synchronization
- Determine new accounts to be invited based on AWS and GiPHouse data.
- Create new AWS member accounts in AWS organization
- Move new AWS member accounts to course semester OU
After the synchronization process has finished, a response box is returned indicating success (green), soft-fail (orange) or hard-fail (red).
Verbose details for each synchronization run is logged using the logging
module and can be accessed in the backend. for example to inspect causes of failed runs.
An example of a possible AWS Organizations environment in the form a tree is the following:
base (root/OU)
│
├── Fall 2022 (OU)
│ ├── [email protected] (member account)
│ └── [email protected] (member account)
│
├── Spring 2023 (OU)
│ ├── [email protected] (member account)
│ └── [email protected] (member account)
│
└── [email protected] (management account)
The "base" (either root or OU), under which all relevant resources are created and operated on as part of the synchronization process, offers flexibility by being configurable in the Django admin panel.
When an AWS member account has been created for a team mailing list as part of an AWS Organization, an e-mail is sent by AWS. This process might take some time and is under AWS' control. It is important to be aware that gaining initial access to the member account is only possible by formally resetting the password; there is no other way. Also note well that each project team member will receive such mails because the team mailing list works as a one-to-many mail forwarder.
By default, all newly created member accounts under an AWS organization are placed under root.
Once the member accounts have been created under root, they are automatically moved to the current course semester OU.
Note that: (1) it is not possible to create a new member account that gets placed in a specific OU and (2) new requested member accounts can not be moved unless the account creation has been finalized to SUCCESS
and AWS does not specify an upper bound for the time it takes for a new member account creation to finalize.
Due to point (2), the code contains the variables ACCOUNT_REQUEST_MAX_ATTEMPTS
for the number of times to check the status of a new member account request, and ACCOUNT_REQUEST_INTERVAL_SECONDS
for the time to wait in between attempts.
These values are currently hard-coded and can be tweaked, should they cause problems with the synchronization process.
Points (1) and (2) pose the possibility of there being a time period between having a newly created member account under root and moving it to its corresponding OU that is restricted with an attached SCP policy, possibly giving the member account excessive permissions. To mitigate this risk, every newly created account comes with a pre-defined tag and the SCP policy attached to root should deny all permissions for accounts under root with the specific tag (see Getting Started section for more details on SCP policy and tag configuration). The tag then automatically gets removed after the account has been moved to its destination course semester OU.
Admin users can create mailing lists using the Django admin interface. A mailing list can be connected to projects, users and 'extra' email addresses that are not tied to a user. Relating a mailing list to a project implicitly makes the members of that project a member of the mailing list. Removing a mailing list in the Django admin will result in the corresponding mailing list to be archived or deleted in G suite during the next synchronization, respecting the 'archive instead of delete' property of the deleted mailing list. To sync a mailing list with G Suite, one can run the management command: ./manage.py sync_mailing_list
or use the button in the model admin. This will sync all mailing lists and the automatic lists into G Suite at the specified domain.
This sync starts by creating groups in G Suite for all mailing lists currently not in there, after they are created a request is done per member of that group to add them to the group. For the already existing groups a list is made of existing members in the group and the needed inserts or deletes are done to update the group.
A task is a process that takes more time than can fit in a request. The process is run in a separate thread and the status is synced to the task. The task is then used to show the user the progress and redirect them when it is finished.
Bootstrap and Font Awesome are used to style the website. Their respective SCSS versions are used.
The website (and Django) have features to make developing and testing changes to the code locally easy.
The website has multiple settings. By default, the development settings are used.
Follow the following steps to setup your own personal development environment.
- Install Python 3.8+ and poetry.
- Clone this repository.
- Run
poetry install
to install all dependencies into virtual environment. - Run
poetry shell
to enter the virtual environment. - Run
python website/manage.py migrate
to initialize the database. - Run
python website/manage.py runserver
to start the local testing server. - Run
python website/manage.py runserver
again, to make sure the server discovers the just created/static/
files.
Because the authentication is based on Github OAuth authentication, some setup is required for users to be able to login in their own development environment.
You will need to set up your own GitHub App that we will use for OAuth (and for repository synchronisation as well, as explained in the next step) and set your client ID and client secret key as environment variables (DJANGO_GITHUB_CLIENT_ID
and DJANGO_GITHUB_CLIENT_SECRET
respectively). direnv is a tool that allows Linux users to do this automatically.
You will then be able to create a new superuser with the createsuperuser
management command.
$ python website/manage.py createsuperuser --github_id=<your_github_id> --github_username=<your_github_username> --no-input
To enable the synchronisation functionality of repositories and project(team)s, a GitHub App must be registered and installed in an organization. This GitHub App needs the following permissions:
- Repository/Metadata: read-only
- Repository/Administration: read and write
- Organization/Members: read and write
- User/Email addresses: read-only
We assume you already have a GitHub organization setup.
-
As an organization, you can develop a GitHub app and register this app at GitHub. People can then install that app in their own account or organization, giving your app access to that account or organization. For this project, you will need to first create your own GitHub app and then install it in your organization. After this, you can find a
DJANGO_GITHUB_SYNC_APP_ID
, and download the RSADJANGO_GITHUB_SYNC_APP_PRIVATE_KEY
. To use this key, encode itbase64
toDJANGO_GITHUB_SYNC_APP_PRIVATE_KEY_BASE64
. These will be used as environment variables in this project and need to be set as GitHub Actions secrets in the repository (which will be explained later). -
After the app is created, it needs to be installed in your own organization (although technically speaking, it is also possible to publish the app in the previous step and install the app in a different organization!). On installation, you can find the
DJANGO_GITHUB_SYNC_APP_INSTALLATION_ID
which we also need to set in this project. This installation id is hidden in the overview of installed GitHub Apps in your organization. Additionally, you need to set theDJANGO_GITHUB_SYNC_ORGANIZATION_NAME
to the name of the organization the app is installed in.
To enable the synchronisation feature of mailing lists to G Suite, a project and service account need to be setup.
- Create a project in the google cloud console.
- Create a service account and credentials.
- Enable domain wide delegation of authority, with the scopes:
https://www.googleapis.com/auth/admin.directory.group
(for accessing groups and adding or deleting members)https://www.googleapis.com/auth/apps.groups.settings
(for changing settings of groups and adding aliases)
- If this is the first time using the Groups Settings API, enable it for the project.
The credentials and admin user can then be setup in Github secrets. The email of the G Suite user used to manage to the G Suite domain has to be stored in the Github secret DJANGO_GSUITE_ADMIN_USER
. The credentials json file has to be base64
encoded and stored in the Github secret DJANGO_GSUITE_ADMIN_CREDENTIALS_BASE64
(you can use the linux command base64
for encoding the json file).
To enable the AWS synchronisation feature, the following points need to be configured only once in advance:
- Create AWS Organizations with all features enabled.
- Ensure Service Control Policies (SCPs) feature is enabled.
- Enable AWS CloudTrail for logging account activity (optional, recommended).
- Increase AWS Organizations quota for maximum number of member accounts to expected amount.
- Default quota is set to 10.
- Expected amount should be at least the number of unique projects in the current semester.
- Set AWS API credentials for
boto3
as environment variables.AWS_ACCESS_KEY_ID
: access key for AWS account.AWS_SECRET_ACCESS_KEY
: secret key for AWS account.- (!) Currently not automated using GitHub secrets.
- AWS API caller has sufficient permissions for all synchronization actions.
- AWS API caller is IAM user acting on behalf of the management account of the AWS Organizations.
- Ensure you are logged in as the management account when creating the IAM user for API access.
- Pre-defined IAM policies
AWSOrganizationsFullAccess
andIAMFullAccess
are more than sufficient. - (!) Restrictive custom IAM policy adhering to the principle of least privilege is recommended.
- AWS API caller is IAM user acting on behalf of the management account of the AWS Organizations.
- Create SCP policies under AWS Organizations.
- SCP policy restricting member accounts under root with a custom key-value tag.
- Manually attach policy to root.
- SCP policy for course semester OUs (e.g. to only allow EC2 resources of a specific type).
- Automatically attached to course semester OUs.
- SCP policy restricting member accounts under root with a custom key-value tag.
- Configure a current AWS Policy under
Projects/AWS Policies
in the Django admin panel.- Set the base ID (root or OU) value.
- Set the SCP policy ID value for course semester OUs.
- Set the restricting custom key-value tag specified in the root SCP policy.
- Mark the
Is current policy
checkbox to make the configuration active.
The Python dependencies are managed using a tool called Poetry, which automatically creates virtual environments that ease development and makes it easy to manage the dependencies. See the Poetry documentation for more information.
To make testing in your development environment easier, the management command createfixtures
exists. This management command creates dummy data in your local database.
$ python website/manage.py createfixtures
Use the --help
argument to get more information.
To make sure everything functions correctly, the website has tests (using Djangos built-in test framework). To run these manually use
$ python website/manage.py test website/
The code of this project has high standards. This is enforced by continuous integration (GitHub Actions).
- PEP 8 (Python styling) is enforced by using Black and flake8.
black
is a tool that formats Python code. It is meant to be run on Python code and always return the same well-formatted code. This removes the need of manually formatting Python code, because if a formatting mistake is made, runningblack
on the project will fix the mistake.
- Test coverage (both statement and branch coverage) is enforced by using Coverage.py.
- The website has 100% test coverage.
- PEP 257 (Python docstring conventions) is enforced by
pydocstyle
.- PEP 257 forces every function, method and class to have documentation.
Django is only a web framework it cannot run without a web server and a database. Django needs a WSGI server to run the Python code. We use uWSGI
. uWSGI
needs a webserver that handles the web traffic, for this we use NGINX
(which also handles all the static files). In development a simple SQLite3 database is used, because it is easy to setup and easy to delete. In production a more robust solution is necessary, that is why Postgres is used as production database.
The website is meant to be run as a Docker container on a Linux server. The use of Docker containers, allows us to create a tailor-made environment (a Docker image) that has all the dependencies (i.e. files, libraries and packages) pre-installed.
The Dockerfile which steps should be executed to create the Docker image. This Docker image has all dependencies and the source code of the website. It also specifies that the production settings should be used.
The entrypoint (/usr/local/bin/entrypoint.sh
inside the Docker image) is executed whenever a container is created from the Docker image, it waits for the Postgres database to come up, makes the necessary changes to the static files and the database, creates a superuser (if it does not exist yet) and starts uWSGI
.
A Docker image of the website (docker.pkg.github.com/giphouse/website/production) is available on GitHub Packages. This image is refreshed every time a change is merged into the master
branch.
docker-compose
is a tool that allows us to run multiple Docker containers that are connected to make them work together. Please see the docker-compose.yaml
file for the exact settings. This file contains all the configuration to pass the correct environment variables to the containers and save the correct files to the host.
The following services are created by docker-compose
.
nginx-proxy
is a Docker image containing that automatically generates correct NGINX
configuration for other Docker Images. It listens on port 80 (HTTP) and 443 (HTTPS) and acts as the main access point for all web traffic to the website.
letsencrypt-nginx-proxy-companion
is a Docker image that uses Let's Encrypt to request a TLS certificate to make the website available over HTTPS.`
Runs a Postgres Database.
giphouse/giphousewebsite
is the Docker image that runs the actual website using uWSGI
as server.
Whenever a change is merged into the master
branch, the deploy.yaml
GitHub Actions workflow is run. This workflow does the following:
- Build a Docker image using the
master
branch. - Push the Docker image to GitHub Packages.
After the build-docker
job is finished, the deploy
job runs.
- Sets up SSH key that allows the runner to login to the production server using SSH.
- Fills in passwords and secrets in the necessary files.
- Create the necessary directories and files on the production server.
- Pulls the new Docker image.
- Restarts the production website with the new Docker image.
- Purge all old unused Docker images.
This repository is public and the GitHub Actions CI runner logs are also public, but some deployment information is secret. The secret information is setup using GitHub repository secrets. These are passed as environment variables to the GitHub Actions CI runners. The following secrets are setup.
DJANGO_SECRET_KEY
: TheSECRET_KEY
for Django.DJANGO_GITHUB_SYNC_APP_ID
: The App ID of the registered GitHub App installed in the GiPHouse organization.DJANGO_GITHUB_SYNC_APP_PRIVATE_KEY_BASE64
: The private RSA key (PEM formatted) of the registered GitHub App installed in the GiPHouse organization,base64
encoded.DJANGO_GITHUB_SYNC_APP_INSTALLATION_ID
: The Installation ID of the registered GitHub App installed in the GiPHouse organization.DJANGO_GITHUB_SYNC_ORGANIZATION_NAME
: The name of the organization the registered GitHub App is installed in.DJANGO_GITHUB_CLIENT_ID
: The GiPHouse organization GitHub (OAuth) App client ID.DJANGO_GITHUB_CLIENT_SECRET
: The GiPHouse organization GitHub (OAuth) App client secret key.DJANGO_GITHUB_SYNC_SUPERUSER_ID
: The Github ID of the initial superuser.DJANGO_GSUITE_ADMIN_USER
: The user which the G Suite api will impersonate when logging in with the credentials.DJANGO_GSUITE_ADMIN_CREDENTIALS_BASE64
: The G Suite service account key file in json format, thenbase64
encoded.POSTGRES_NAME
: The name of the Postgres database.POSTGRES_USER
: The username that is used to interact with the Postgres database.POSTGRES_PASSWORD
: The password that is used to interact with the Postgres database.SSH_USER
: The user that the CI runner uses to login to the production server using SSH. This user should be allowed to use Docker (e.g. be a member of thedocker
group).SSH_PRIVATE_KEY
: The private key that allows theSSH_USER
user to login to the production server using SSH.
The current server is an Amazon Web Services Elastic Cloud Computing (AWS EC2) instance that runs Ubuntu 18.04. EC2 instances have a default ubuntu
user, that is allowed to execute sudo
without password. The docker-compose.yaml
file includes all services that are necessary to run the website in a production environment. That is why Docker is the only dependency on the host.
These steps are the necessary setup for a production server.
- Add the SSH public keys of engineers to the
authorized_keys
of theubuntu
user. - Disable SSH password login.
- Install
docker
anddocker-compose
. - Add the
ubuntu
user to thedocker
group. - Add the public key of the
SSH_PRIVATE_KEY
GitHub secret to theauthorized_keys
file of theSSH_USER
GitHub secret user. - Change the url of the website by changing the
DEPLOYMENT_HOST
env variable in the deploy.yaml file. This will set the host in all necessary places.
All moving parts should be regularly updated to make sure all code is up to date and secure. There is no process in place to automate updates, because that may break something. The following
- The Ubuntu server should be updated.
- The Python dependencies should be updated (through
poetry
). - The JavaScript, font and [S]CSS files should be updated by replacing the current files with new ones.