I find myself dealing with a lot of data in various raw forms. I am slowly gathering the scripts I write for this, cleaning them up, and storing them in this repo. Bits and pieces of each may be helpful. The scripts also use a number of Python packages, which you may find useful when trying to extract data from a particular source.
- python
- python-pip
- virtualenv
Clone the project locally:
$ git clone https://github.com/mmonroe86/ETL_scripts.git
Create a virtual environment in the project directory:
$ virtualenv ETL_scripts
Activate the virtual environment:
$ source ETL_scripts/bin/activate
Install the project's Python package requirements:
$ pip install -r ETL_scripts/requirements.txt
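Before installing packages or running a script, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch, not part of this repo: the helper name `in_virtualenv` is hypothetical, and it relies only on the standard `sys` module. It assumes that older virtualenv releases set `sys.real_prefix`, while `venv` and newer virtualenv releases make `sys.prefix` differ from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    # Older virtualenv releases record the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases leave sys.base_prefix pointing at the
    # system interpreter while sys.prefix points at the environment.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Saved to a file and run with `python` after the activate step, it should report whether packages installed with pip will land in the environment or in the system interpreter.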