Linked Docker containers for this data? #3

Open
psychemedia opened this issue Mar 14, 2017 · 6 comments

Comments

@psychemedia

Thanks for publishing these scripts. I was wondering if you also have Dockerfiles and docker-compose scripts that can build, e.g., a MySQL database containing the data in one container, and then demonstrate linked R or Python containers for analysing the data in the database?

@drkane
Collaborator

drkane commented Mar 15, 2017

That's a very good idea. I'm not entirely sure how to accomplish it, but I think the Dockerfile would probably need something like the following commands to run it:

RUN python /import.py /data/RegPlusExtract_January_2017.zip
RUN mysql  -e "CREATE DATABASE ccextract;"
RUN mysql  ccextract < table-definition.sql
RUN mysqlimport --local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by="\"" --fields-escaped-by='\' --verbose --delete ccextract data/extract_acct_submit.csv data/extract_aoo_ref.csv data/extract_ar_submit.csv data/extract_charity.csv data/extract_charity_aoo.csv data/extract_class.csv data/extract_class_ref.csv data/extract_financial.csv data/extract_main_charity.csv data/extract_name.csv data/extract_objects.csv data/extract_partb.csv data/extract_registration.csv data/extract_remove_ref.csv data/extract_trustee.csv

(Would need proper login details added to the mysql commands).

Have you got an example of a similar container?
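
For reference, here is a minimal Python sketch of the same import step, assuming mysqlimport is installed and a MySQL server is already running and reachable. It builds the command as an argument list, so the quote and backslash options need no shell escaping. The function name build_mysqlimport_args, the data_dir default and the login values are hypothetical placeholders, not part of the original scripts.

```python
import shlex

# Table names taken from the CSV files listed in the mysqlimport command above.
TABLES = [
    "extract_acct_submit", "extract_aoo_ref", "extract_ar_submit",
    "extract_charity", "extract_charity_aoo", "extract_class",
    "extract_class_ref", "extract_financial", "extract_main_charity",
    "extract_name", "extract_objects", "extract_partb",
    "extract_registration", "extract_remove_ref", "extract_trustee",
]


def build_mysqlimport_args(database, data_dir="data", user=None,
                           password=None, host=None):
    """Return the mysqlimport command as a list of arguments."""
    args = ["mysqlimport"]
    # Login details are optional here; real values must be supplied by the caller.
    if host:
        args.append(f"--host={host}")
    if user:
        args.append(f"--user={user}")
    if password:
        args.append(f"--password={password}")
    args += [
        "--local",
        "--ignore-lines=1",
        "--fields-terminated-by=,",
        '--fields-optionally-enclosed-by="',
        "--fields-escaped-by=\\",  # a single backslash character
        "--verbose",
        "--delete",
        database,
    ]
    # One CSV file per table, in the same order as the original command.
    args += [f"{data_dir}/{name}.csv" for name in TABLES]
    return args


if __name__ == "__main__":
    # "root" / "changeme" are placeholder credentials (assumption).
    cmd = build_mysqlimport_args("ccextract", user="root", password="changeme")
    print(shlex.join(cmd))  # shell-quoted form, for inspection
    # import subprocess; subprocess.run(cmd, check=True)  # needs a running server
```

Passing the list straight to subprocess.run (without shell=True) hands each option to mysqlimport exactly as written, which sidesteps differences in how various shells treat a bare backslash.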

@drkane
Collaborator

drkane commented Mar 15, 2017

You might find this useful/interesting too: https://github.com/tithebarn/charity-base. Same data, but in MongoDB and with an API.

@psychemedia
Author

Closest I've got is PostgreSQL and that's muddled up in https://github.com/psychemedia/ou-tm351/tree/master/TM351_docker_Testing

I really need to do a simple worked teaching example for my own reference/reuse!

@psychemedia
Author

There are examples of getting Node running in containers at, e.g., https://nodejs.org/en/docs/guides/nodejs-docker-webapp/

@drkane
Collaborator

drkane commented Mar 15, 2017

That's useful, thanks. I'd be happy to collaborate on a worked example - tbh I find docker and particularly docker-compose confusing and struggle to get my head around it sometimes!

@psychemedia
Author

psychemedia commented Mar 15, 2017

I made a start but hit issues I don't have time to debug right now:-( (Most of my own reusable scripts are for postgres or mongo).

Init script for use by the Dockerfile - this breaks when the build process tries to run it (can't connect):

The chinit.sh file:

#!/bin/bash

echo "Starting up..."

#I'm not sure if we need to start and stop the db daemon?
#Couldn't offhand find a Hello World demo that just worked..!
#http://stackoverflow.com/a/29150538/454773
/usr/sbin/mysqld &
#until mysqladmin ping >/dev/null 2>&1; do
#  echo -n "."; sleep 0.2
#done
sleep 5

echo "Creating database..."
mysql  -e "CREATE DATABASE ccextract;"
echo "...done"

echo "Creating tables..."
mysql  ccextract < /tmp/table-definition.sql
echo "...done"

echo "Importing..."
mysqlimport --local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by="\"" --fields-escaped-by='\' --verbose --delete ccextract data/extract_acct_submit.csv data/extract_aoo_ref.csv data/extract_ar_submit.csv data/extract_charity.csv data/extract_charity_aoo.csv data/extract_class.csv data/extract_class_ref.csv data/extract_financial.csv data/extract_main_charity.csv data/extract_name.csv data/extract_objects.csv data/extract_partb.csv data/extract_registration.csv data/extract_remove_ref.csv data/extract_trustee.csv
echo "...done"

#Do we need to shut down any db daemon we started?
#mysqladmin shutdown
echo "...and done"
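
The fixed sleep 5 in the script above is a guess at how long the server takes to start. A minimal sketch of the polling alternative (the same idea as the commented-out mysqladmin ping loop) is given below in Python, assuming the server listens on a TCP port. The helper name wait_for_port and its defaults are hypothetical. An open port only shows that something is accepting connections, not that MySQL is ready for queries, so treat this as a first check rather than a guarantee.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.2):
    """Poll host:port until a TCP connection succeeds.

    Returns True once a connection is accepted, or False if `timeout`
    seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError (refused, unreachable,
            # unknown host) until the server is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # "mysqlcc" is the service name used in the docker-compose.yml in this thread.
    if wait_for_port("mysqlcc", 3306, timeout=60.0):
        print("MySQL port is open")
    else:
        print("Gave up waiting for MySQL")
```

The same helper could be run from the Jupyter container before attempting a database connection, since the linked service name resolves as a hostname there.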

Dockerfile to build the MySQL container - this can go in the same directory as other files, along with the data zip file ...

FROM mysql/mysql-server

#We need to copy (ADD) the files we're using from the local directory into the container image
ADD import.py ./import.py
ADD bcp.py ./bcp.py

ADD RegPlusExtract_March_2017.zip data/RegPlusExtract_March_2017.zip
ADD table-definition.sql /tmp/table-definition.sql

RUN python ./import.py data/RegPlusExtract_March_2017.zip

#Anything in /docker-entrypoint-initdb.d/ should run as an init script?
ADD chinit.sh /docker-entrypoint-initdb.d/chinit.sh

#Not sure if we should just explicitly run things directly, at least in testing!
#i.e. comment out the /docker-entrypoint-initdb.d/ ADD and replace with:
#ADD chinit.sh /chinit.sh
#RUN /chinit.sh

The following docker-compose.yml can also go in the same folder. Run it with docker-compose up --build -d and it will build the MySQL container from the local Dockerfile, pull the Jupyter notebook image from Docker Hub, start the two containers and link them together:

#docker-compose up --build -d
mysqlcc:
  build: .
  environment:
    MYSQL_ROOT_PASSWORD: charitable
  expose:
    - "3306"
jupytercc:
  image: jupyter/scipy-notebook
  links:
    - mysqlcc
  ports:
    - "8888"
  volumes:
    - .:/home/jovyan/work

Running a Python 2 notebook, we should then be able to connect using something like the following:

##Use a py2 notebook (most straightforward mysql py package doesn't work in py 3?)
!conda install -y mysql-python
import sqlalchemy
engine = sqlalchemy.create_engine('mysql://root:charitable@mysqlcc:3306')
engine.execute("USE ccextract")

However, on the one occasion I did get the db running (not sure how...), I got a disallowed connection message: the incoming connection was from a disliked IP address (the IP address of the Jupyter notebook). So no idea what's happening there either... (I'm a bit out of practice with this Docker stuff; maybe my memories of it "just working" are all false memories!)
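
Two hedged notes on the connection step. First, one possible (unverified here) cause of the disallowed-host message is that the image creates only a 'root'@'localhost' account; the mysql/mysql-server image documents a MYSQL_ROOT_HOST environment variable for allowing root connections from other hosts. Second, for a Python 3 notebook, the pure-Python PyMySQL driver is one option that SQLAlchemy supports through the mysql+pymysql:// scheme. The sketch below only builds such a URL; the helper name mysql_url is hypothetical, and the connection lines are left as comments because they assume SQLAlchemy and PyMySQL are installed and the database is reachable.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, port=3306, database="", driver="pymysql"):
    """Build a SQLAlchemy-style MySQL URL, percent-encoding the credentials."""
    return (
        f"mysql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Values taken from the docker-compose.yml and notebook snippet above.
    url = mysql_url("root", "charitable", "mysqlcc", database="ccextract")
    print(url)
    # With SQLAlchemy and PyMySQL installed (assumption), something like:
    #   import sqlalchemy
    #   engine = sqlalchemy.create_engine(url)
    #   with engine.connect() as conn:
    #       print(conn.execute(sqlalchemy.text("SELECT 1")).scalar())
```

Naming the database in the URL avoids the separate USE ccextract statement, and encoding the password keeps characters such as @ or / from being misread as URL delimiters.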

2 participants