Project Template

The project-template package creates a project structure and template scripts for a data science project, and provides tools to automate common data science tasks. It has been developed for use with Visual Studio Code for Python projects and RStudio for R projects.

Install

In the command line run:

pip3 install cookiecutter

Create

In the command line move to where you want to create the project directory and run:

python3 -m cookiecutter https://github.com/NICD-UK/project-template

You will be prompted for the following:

  1. Project Name
  2. Project Directory Name
  3. Project Manager Name
  4. Project Manager Email
  5. Project Sponsor Name
  6. Project Sponsor Email
  7. Project Summary
  8. Project Language

Then move into the new project directory and run:

make

This command will:

  1. Initialise a virtual environment:
    • venv for Python
    • renv for R
  2. Install the packages required for the template scripts
  3. Save the packages to a dependencies file:
    • requirements.txt for Python
    • renv.lock for R
  4. Initialise a git repository
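
The four steps above can be sketched in Python for a Python project. This is only an illustration of the sequence, not the contents of the project's Makefile, which is not shown here. The function names and the example package are hypothetical, and the `venv/bin/pip` path assumes a Unix-like system.

```python
import subprocess
import sys
from pathlib import Path


def setup_commands(project_dir: Path, packages: list[str]) -> list[list[str]]:
    """Build commands that roughly mirror the four setup steps (assumed, not the real Makefile)."""
    pip = str(project_dir / "venv" / "bin" / "pip")
    return [
        [sys.executable, "-m", "venv", str(project_dir / "venv")],  # 1. create the virtual environment
        [pip, "install", *packages],                                # 2. install the required packages
        [pip, "freeze"],                                            # 3. list installed packages for requirements.txt
        ["git", "init", str(project_dir)],                          # 4. initialise a git repository
    ]


def run_setup(project_dir: Path, packages: list[str]) -> None:
    """Run the commands in order, writing the freeze output to requirements.txt."""
    for command in setup_commands(project_dir, packages):
        if command[-1] == "freeze":
            frozen = subprocess.run(command, check=True, capture_output=True, text=True)
            (project_dir / "requirements.txt").write_text(frozen.stdout)
        else:
            subprocess.run(command, check=True)
```

Calling `run_setup(Path("."), ["pandas"])` from a project directory would run the four commands in order; `pandas` is only a placeholder package name.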

Usage

To install a package in Python run:

venv/bin/pip install <package>

To install a package in R use the Packages tab in RStudio.

To save packages to the dependencies file run:

make save

To load packages from the dependencies file run:

make load
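
For a Python project, one rough way to decide which of the two commands is needed is to compare the output of `venv/bin/pip freeze` with the contents of requirements.txt. The sketch below is a hypothetical helper, not part of the template, and it assumes both inputs use the `name==version` line format that `pip freeze` produces.

```python
def parse_pins(text: str) -> set[str]:
    """Collect non-empty, non-comment lines such as 'pandas==2.2.0'."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    }


def compare(frozen: str, saved: str) -> tuple[set[str], set[str]]:
    """Return (installed but not saved, saved but not installed)."""
    installed = parse_pins(frozen)
    recorded = parse_pins(saved)
    return installed - recorded, recorded - installed
```

Entries in the first set suggest running `make save`; entries in the second set suggest running `make load`.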

Project Structure

The project has the following structure:

Makefile
README.md
data/
├─ clean/
├─ raw/
├─ wrangle/
models/
notebooks/
presentations/
reports/
src/
├─ 1-import/
├─ 2-clean/
├─ 3-wrangle/
├─ 4-model/
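
The layout above can be checked with a short script. This is a minimal sketch and not part of the template; the function name `missing_paths` is hypothetical, and the expected paths are copied from the listing.

```python
from pathlib import Path

# Directories and files taken from the structure listed above.
EXPECTED_DIRS = [
    "data/clean", "data/raw", "data/wrangle",
    "models", "notebooks", "presentations", "reports",
    "src/1-import", "src/2-clean", "src/3-wrangle", "src/4-model",
]
EXPECTED_FILES = ["Makefile", "README.md"]


def missing_paths(root: Path) -> list[str]:
    """Return the expected directories and files that do not exist under root."""
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Running `missing_paths(Path("."))` from the project directory returns an empty list when the full structure is in place.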

Template Scripts

There are template scripts, available in Python or R, for:

  1. transforming raw data into cleaned data in src/2-clean/,
  2. visualising cleaned data in src/2-clean/,
  3. transforming cleaned data into wrangled data in src/3-wrangle/,
  4. visualising wrangled data in src/3-wrangle/.

Answer Python or R to the Project Language prompt during setup to get the corresponding template scripts. All template transformation scripts include code to read data from and write data to the appropriate data directories. All template visualisation scripts include code to read data from the appropriate data directory and to generate a data report. There is also a template script, available in Quarto, for presenting data in presentations/.
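
The read-and-write pattern of a transformation script can be sketched as follows. This is not the template's own code: the function names, the CSV format, and the cleaning rule (strip whitespace, drop empty rows) are assumptions chosen for illustration. Only the data/raw/ and data/clean/ directories come from the project structure.

```python
import csv
from pathlib import Path


def clean_rows(rows):
    """Strip whitespace from every field and drop rows where all fields are empty."""
    cleaned = []
    for row in rows:
        stripped = {k: (v or "").strip() for k, v in row.items() if k is not None}
        if any(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def clean_file(project_dir: Path, name: str) -> Path:
    """Read data/raw/<name>, clean it, and write the result to data/clean/<name>."""
    source = project_dir / "data" / "raw" / name
    target = project_dir / "data" / "clean" / name
    with source.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        rows = clean_rows(reader)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return target
```

A wrangling script would follow the same shape, reading from data/clean/ and writing to data/wrangle/.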
