Improve Docker configuration #497

Merged on Jul 28, 2021 (11 commits)
4 changes: 4 additions & 0 deletions .dockerignore
@@ -0,0 +1,4 @@
.git
.github

venv
40 changes: 9 additions & 31 deletions .github/workflows/main.yml
Expand Up @@ -7,50 +7,28 @@ jobs:
runs-on: ubuntu-latest

steps:
- name: Check out repository code
- name: Checkout code
uses: actions/checkout@v2

- name: Set up Python 3.8
- name: Set up Python 3.8.11
uses: actions/setup-python@v2
with:
python-version: 3.8

- name: Restore cache
uses: actions/cache@v2
with:
path: .venv
key: ${{ runner.os }}-venv-${{ hashFiles('**/requirements*.txt') }}
restore-keys: |
${{ runner.os }}-venv-
python-version: 3.8.11

- name: Install dependencies
run: |
sudo apt install python3-dev postgresql libpq-dev build-essential libxml2-dev libxslt1-dev postgresql ncat
python -m pip install --upgrade pip
sudo apt-get install -y postgresql python3-dev libpq-dev build-essential
make dev envfile

- uses: syphar/restore-virtualenv@v1
id: cache-virtualenv
with:
requirement_files: requirements*.txt # this is optional
- name: Validate code format
run: make check

- uses: syphar/restore-pip-download-cache@v1
if: steps.cache-virtualenv.outputs.cache-hit != 'true'

- run: pip install -r requirements.txt -r requirements-dev.txt
if: steps.cache-virtualenv.outputs.cache-hit != 'true'

- name: Setup database
env:
PGPASSWORD: vulnerablecode
run: |
sudo systemctl start postgresql
sudo -Eu postgres psql -c "CREATE ROLE vulnerablecode WITH PASSWORD '$PGPASSWORD' NOSUPERUSER CREATEDB NOCREATEROLE INHERIT LOGIN;"
sudo systemctl status postgresql
createdb --encoding=utf-8 --owner=vulnerablecode --user=vulnerablecode \
--host=localhost --port=5432 vulnerablecode
make postgres

- name: Run tests
run: python -m pytest -v -m "not webtest"
run: make test
env:
DJANGO_DEV: 1
GH_TOKEN: 1
3 changes: 1 addition & 2 deletions .github/workflows/upstream_test.yml
Expand Up @@ -48,5 +48,4 @@ jobs:
POSTGRES_HOST: localhost
VC_DB_USER: postgres
POSTGRES_PORT: 5432
DJANGO_DEV: 1
GH_TOKEN: 1
GH_TOKEN: 1
30 changes: 0 additions & 30 deletions .travis.yml

This file was deleted.

20 changes: 8 additions & 12 deletions Dockerfile
@@ -1,15 +1,11 @@
FROM python@sha256:e9b7e3b4e9569808066c5901b8a9ad315a9f14ae8d3949ece22ae339fff2cad0
FROM python:3.8

# PYTHONUNBUFFERED=1 ensures that the python output is set straight
# to the terminal without buffering it first
# Force unbuffered stdout and stderr (i.e. they are flushed to terminal immediately)
ENV PYTHONUNBUFFERED 1
RUN mkdir /vulnerablecode
WORKDIR /vulnerablecode
ADD . /vulnerablecode/
RUN pip install -r requirements.txt && \
DJANGO_DEV=1 python manage.py collectstatic
Contributor

Why did you remove collectstatic from the Dockerfile?

To me it makes more sense to run collectstatic when building the base Docker image rather than at every startup in Docker Compose.

We have VulnerableCode set up in Kubernetes, and this change broke it.

Member

@tardyp sorry for the breakage.
@Hritik14 @tdruez do you have any insights?

Collaborator Author

No specific reason from my side, except that the same structure is being used in our other project.
https://github.com/nexB/scancode.io/blob/main/docker-compose.yml

If we revert, we should revert both.

Contributor

@tardyp tardyp Aug 26, 2021

I see in docker-compose that you actually run collectstatic into a shared volume so that the static files are served by nginx.
I think this works well for this docker-compose design.

It won't work well in a Kubernetes deployment, where there are usually several instances of the webapp container, which will compete to run the migration and the static collection.
In my Kubernetes design I didn't bother serving the static files from a dedicated static server.

I think it makes sense to ship the Docker image with the static files inside it, even if docker-compose regenerates them. That would allow both strategies, with or without a static file server.
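
One way to avoid several replicas competing on startup is to make the migration and static collection steps opt-in per container. The following is only a sketch under stated assumptions, not code from this PR: the environment variable names `RUN_MIGRATE` and `RUN_COLLECTSTATIC` and the helper `startup_commands` are hypothetical, while the `manage.py` and `gunicorn` invocations are copied from the docker-compose.yml in this PR.

```python
import os

# The gunicorn invocation is copied from the docker-compose.yml in this PR.
GUNICORN_CMD = [
    "gunicorn", "vulnerablecode.wsgi:application",
    "-u", "nobody", "-g", "nogroup",
    "--bind", ":8000", "--timeout", "600", "--workers", "2",
]


def startup_commands(env):
    """Return the commands a container should run, in order.

    RUN_MIGRATE and RUN_COLLECTSTATIC are hypothetical variable names: set
    them to "1" on exactly one replica (or a one-off job) so that several web
    containers do not compete on the same database or static directory.
    """
    commands = []
    if env.get("RUN_MIGRATE") == "1":
        commands.append(["./manage.py", "migrate"])
    if env.get("RUN_COLLECTSTATIC") == "1":
        commands.append(["./manage.py", "collectstatic", "--no-input", "--clear"])
    # Every container ends by starting the web server.
    commands.append(GUNICORN_CMD)
    return commands


if __name__ == "__main__":
    # Print the plan for the current environment instead of executing it.
    for command in startup_commands(os.environ):
        print(" ".join(command))
```

With neither variable set, a replica only starts gunicorn; a docker-compose setup could set both to keep the current behaviour.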

Contributor

Well, it looks like Django does not recommend having the app serve static files. I thought this was only a performance best practice (which I don't bother with for my 4 users), but they say it is also insecure.
https://docs.djangoproject.com/en/3.2/ref/contrib/staticfiles/#django.contrib.staticfiles.views.serve

So I think I will have to deploy an nginx as well.

Collaborator Author

Yes. Static files are only supposed to be served by a proper web server or CDN. You could serve them via the development server using the --insecure flag, and it is exactly what it sounds like: insecure.
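
For reference, a minimal sketch of the Django settings this setup implies, assuming the directory created in the Dockerfile (`/var/vulnerablecode/static/`) is the collectstatic target. The `/static/` URL prefix is an assumption and is not taken from this project's settings.py.

```python
# Fragment of a Django settings module (illustrative, not this project's file).

# Assumed URL prefix under which nginx (or a CDN) exposes the collected files.
STATIC_URL = "/static/"

# Directory that `manage.py collectstatic` copies files into. This path matches
# the one created in the Dockerfile and mounted as the `static` volume.
STATIC_ROOT = "/var/vulnerablecode/static/"
```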

Collaborator Author

> I see in docker-compose that you actually run collectstatic into a shared volume so that the static files are served by nginx.

I don't think we would have any trouble keeping the shared volume even if we invoke collectstatic inside the Docker image. The best rationale I can think of is that, because staticfiles is meant for production, it should be invoked in production.


LABEL "base_image": "pkg:docker/python@sha256%3Ae9b7e3b4e9569808066c5901b8a9ad315a9f14ae8d3949ece22ae339fff2cad0"
Hritik14 marked this conversation as resolved.
LABEL "dockerfile_url": "https://github.com/nexB/vulnerablecode/blob/develop/Dockerfile"
LABEL "homepage_url": "https://github.com/nexB/vulnerablecode"
LABEL "license": "Apache-2.0"
RUN mkdir /opt/vulnerablecode && \
mkdir -p /var/vulnerablecode/static/
WORKDIR /opt/vulnerablecode
COPY . .
RUN python -m pip install --upgrade pip && \
pip install -r requirements.txt
117 changes: 117 additions & 0 deletions Makefile
@@ -0,0 +1,117 @@
# SPDX-License-Identifier: Apache-2.0
#
# http://nexb.com and https://github.com/nexB/scancode.io
# The ScanCode.io software is licensed under the Apache License version 2.0.
# Data generated with ScanCode.io is provided as-is without warranties.
# ScanCode is a trademark of nexB Inc.
#
# You may not use this software except in compliance with the License.
# You may obtain a copy of the License at: http://apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software distributed
# under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
# Data Generated with ScanCode.io is provided on an "AS IS" BASIS, WITHOUT WARRANTIES
# OR CONDITIONS OF ANY KIND, either express or implied. No content created from
# ScanCode.io should be considered or used as legal advice. Consult an Attorney
# for any legal advice.
#
# ScanCode.io is a free software code scanning tool from nexB Inc. and others.
# Visit https://github.com/nexB/scancode.io for support and download.
# Modified for VulnerableCode use

# Python version can be specified with `$ PYTHON_EXE=python3.x make conf`
PYTHON_EXE?=python3
VENV=venv
ACTIVATE?=. ${VENV}/bin/activate;
VIRTUALENV_PYZ=etc/thirdparty/virtualenv.pyz
BLACK_ARGS=-l 100 .
# Do not depend on Python to generate the SECRET_KEY
GET_SECRET_KEY=`base64 /dev/urandom | head -c50`
# Customize with `$ make envfile ENV_FILE=/etc/vulnerablecode/.env`
ENV_FILE=.env
# Customize with `$ make postgres VULNERABLECODE_DB_PASSWORD=YOUR_PASSWORD`
VULNERABLECODE_DB_PASSWORD=vulnerablecode

# Use sudo for postgres, but only on Linux
UNAME := $(shell uname)
ifeq ($(UNAME), Linux)
SUDO_POSTGRES=sudo -u postgres
else
SUDO_POSTGRES=
endif

virtualenv:
@echo "-> Bootstrap the virtualenv with PYTHON_EXE=${PYTHON_EXE}"
@${PYTHON_EXE} ${VIRTUALENV_PYZ} --never-download --no-periodic-update ${VENV}

conf: virtualenv
@echo "-> Install dependencies"
@${ACTIVATE} pip install -r requirements.txt

dev: conf
@echo "-> Configure and install development dependencies"
@${ACTIVATE} pip install -r requirements-dev.txt

envfile:
@echo "-> Create the .env file and generate a secret key"
@if test -f ${ENV_FILE}; then echo ".env file exists already"; exit 1; fi
@mkdir -p $(shell dirname ${ENV_FILE}) && touch ${ENV_FILE}
@echo SECRET_KEY=\"${GET_SECRET_KEY}\" > ${ENV_FILE}

check:
@echo "-> Run black validation"
@${ACTIVATE} black --check ${BLACK_ARGS}

black:
@echo "-> Apply black code formatter"
${VENV}/bin/black ${BLACK_ARGS}

valid: black

clean:
@echo "-> Clean the Python env"
rm -rf ${VENV}

migrate:
@echo "-> Apply database migrations"
${ACTIVATE} ./manage.py migrate

postgres:
@echo "-> Configure PostgreSQL database"
@echo "-> Create database user 'vulnerablecode'"
${SUDO_POSTGRES} createuser --no-createrole --no-superuser --login --inherit --createdb vulnerablecode || true
${SUDO_POSTGRES} psql -c "alter user vulnerablecode with encrypted password '${VULNERABLECODE_DB_PASSWORD}';" || true
@echo "-> Drop 'vulnerablecode' database"
${SUDO_POSTGRES} dropdb vulnerablecode || true
@echo "-> Create 'vulnerablecode' database"
${SUDO_POSTGRES} createdb --encoding=utf-8 --owner=vulnerablecode vulnerablecode
@$(MAKE) migrate

sqlite:
@echo "-> Configure SQLite database"
@echo VULNERABLECODE_DB_ENGINE=\"django.db.backends.sqlite3\" >> ${ENV_FILE}
@echo VULNERABLECODE_DB_NAME=\"sqlite3.db\" >> ${ENV_FILE}
@$(MAKE) migrate

run:
${ACTIVATE} ./manage.py runserver

test:
@echo "-> Run the test suite"
${ACTIVATE} ${PYTHON_EXE} -m pytest -v -m "not webtest"

package: conf
@echo "-> Create a VulnerableCode package for offline installation"
@echo "-> Fetch dependencies in thirdparty/ for offline installation"
rm -rf thirdparty && mkdir thirdparty
${VENV}/bin/pip download -r requirements.txt --no-cache-dir --dest thirdparty
@echo "-> Create package in dist/ for offline installation"
${VENV}/bin/python setup.py sdist

install: virtualenv
@echo "-> Install and configure the Python env with base dependencies, offline"
${VENV}/bin/pip install --upgrade --no-index --no-cache-dir --find-links=thirdparty -e .

.PHONY: virtualenv conf dev envfile install check valid clean migrate postgres sqlite run test package
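
For readers more familiar with Python than with shell pipelines, the `envfile` target above behaves roughly like the sketch below. This is an illustration only: the Makefile deliberately avoids depending on Python for the key, and the function names `make_secret_key` and `write_env_file` are hypothetical.

```python
import base64
import os


def make_secret_key(length=50):
    """Return `length` base64 characters drawn from the OS random source."""
    # `length` random bytes encode to more than `length` base64 characters,
    # so slicing never reaches the trailing "=" padding.
    raw = os.urandom(length)
    return base64.b64encode(raw).decode("ascii")[:length]


def write_env_file(path=".env"):
    """Create `path` with a SECRET_KEY line; refuse to overwrite an existing path."""
    if os.path.exists(path):
        raise FileExistsError(f"{path} exists already")
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as env_file:
        env_file.write(f'SECRET_KEY="{make_secret_key()}"\n')
```

Like the Makefile target, the sketch fails when the file already exists rather than replacing a key that may be in use.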
34 changes: 9 additions & 25 deletions README.rst
Expand Up @@ -90,29 +90,10 @@ First clone the source code::
cd vulnerablecode




Using Docker Compose
~~~~~~~~~~~~~~~~~~~~

An easy way to set up VulnerableCode is with docker containers and docker
compose. For this you need to have the following installed.

- Docker Engine. Find instructions to install it
`here <https://docs.docker.com/get-docker/>`__
- Docker Compose. Find instructions to install it
`here <https://docs.docker.com/compose/install/#install-compose>`__

Use ``sudo docker-compose up`` to start VulnerableCode. Then access
VulnerableCode at http://localhost:8000/ or at http://127.0.0.1:8000/

**Important**: Don't forget to run ``sudo docker-compose up -d --no-deps --build web`` to sync your instance after every ``git pull``.


Use ``sudo docker-compose exec web bash`` to access the VulnerableCode
container. From here you can access ``manage.py`` and run management commands
to import data as specified below.
---------------------

Please find the docker documentation in `Docker Installation <docs/docker_installation.rst>`__

Without Docker Compose
~~~~~~~~~~~~~~~~~~~~~~
Expand Down Expand Up @@ -159,11 +140,13 @@ for this purpose::

SECRET_KEY=$(python -c "from django.core.management import utils; print(utils.get_random_secret_key())")

You will also need to setup the VC_ALLOWED_HOSTS environment variable to match the hostname where the app is deployed::
You will also need to set up the ``ALLOWED_HOSTS`` list inside ``vulnerablecode/settings.py`` according to the
`Django specifications <https://docs.djangoproject.com/en/3.2/ref/settings/#allowed-hosts>`__. One example would be:

.. code-block:: python

VC_ALLOWED_HOSTS=vulnerablecode.your.domain.example.com
ALLOWED_HOSTS = ['vulnerablecode.your.domain.example.com']

You can specify several host by separating them with a colon `:`
You can specify several hosts by separating them with a comma (`,`)

Using Nix
~~~~~~~~~
Expand Down Expand Up @@ -213,6 +196,8 @@ Use these commands to run code style checks and the test suite::
python -m pytest


.. _Data import:

Data import
-----------

Expand Down Expand Up @@ -266,7 +251,6 @@ If you want to run the import periodically, you can use a systemd timer::

[Service]
Type=oneshot
Environment="DJANGO_DEV=1"
ExecStart=/path/to/venv/bin/python /path/to/vulnerablecode/manage.py import --all

$ cat ~/.config/systemd/user/vulnerablecode.timer
Expand Down
49 changes: 34 additions & 15 deletions docker-compose.yml
@@ -1,22 +1,41 @@
version: '3'

services:
web:
environment:
- DJANGO_DEV=1
- VC_DB_HOST=db
db:
image: postgres
env_file:
- docker.env
volumes:
- db_data:/var/lib/postgresql/data/

vulnerablecode:
build: .
command: bash -c "python manage.py migrate && python manage.py runserver 0.0.0.0:8000"
container_name: "vulnerablecode"
command: /bin/sh -c "
./manage.py migrate &&
./manage.py collectstatic --no-input --clear &&
gunicorn vulnerablecode.wsgi:application -u nobody -g nogroup --bind :8000 --timeout 600 --workers 2"
env_file:
- docker.env
volumes:
- .:/vulnerablecode
ports:
- "8000:8000"
- static:/var/vulnerablecode/static/
restart: on-failure
depends_on:
- db
db:
image: postgres
environment:
- POSTGRES_DB=vulnerablecode
- POSTGRES_USER=vulnerablecode
- POSTGRES_PASSWORD=vulnerablecode

nginx:
image: nginx
env_file:
- docker.env
volumes:
- static:/var/vulnerablecode/static/
- ./etc/nginx/templates/:/etc/nginx/templates/
ports:
- ${NGINX_PORT:-8000}:80
depends_on:
- vulnerablecode


volumes:
static:
db_data:

8 changes: 8 additions & 0 deletions docker.env
@@ -0,0 +1,8 @@
POSTGRES_DB=vulnerablecode
POSTGRES_USER=vulnerablecode
POSTGRES_PASSWORD=vulnerablecode

DJANGO_SETTINGS_MODULE=vulnerablecode.settings
VULNERABLECODE_DB_HOST=db

GUNICORN_SERVER=vulnerablecode