This repository provides materials for a student-run session on managing pipelines with the R package targets. The session is part of the IDS Tools for Data Science workshop 2024, held at the Hertie School, Berlin, in October 2024. The student-run workshop is part of the course Introduction to Data Science, taught by Simon Munzert at the Hertie School, Berlin, in Fall 2024.
This session introduces the targets package and shows how it can help you organise your pipeline. A pipeline, also called a recipe, is the process that takes in raw data and outputs a result.
Data wrangling is one of the core steps in the data science workflow. dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges, including the manipulation of datasets and variables.
The goals of this session are to (1) introduce the difference between managing your pipeline with and without the targets package, (2) show you the key functions of the package, (3) explain the structure of files and folders needed to run the package, and (4) provide practice material for your reference.
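To give a rough idea of what a pipeline managed with targets can look like, here is a minimal sketch of a `_targets.R` script. It is an illustration only and is not taken from the session materials: the helper functions `clean_cars()` and `summarise_cars()` and the target names are hypothetical, and the built-in `mtcars` dataset stands in for real raw data.

```r
# _targets.R: hypothetical minimal pipeline script, for illustration only
library(targets)

# Hypothetical helper: keep only the four-cylinder cars
clean_cars <- function(data) {
  data[data$cyl == 4, ]
}

# Hypothetical helper: average fuel efficiency of the remaining cars
summarise_cars <- function(data) {
  mean(data$mpg)
}

# The script ends with a list of targets; each target names one step
# and the command that produces it from earlier steps
list(
  tar_target(raw_data, mtcars),                   # stand-in for raw data
  tar_target(clean_data, clean_cars(raw_data)),   # depends on raw_data
  tar_target(result, summarise_cars(clean_data))  # depends on clean_data
)
```

With this file in the project root, calling `targets::tar_make()` should run the steps in dependency order and store their outputs in a `_targets/` folder, and `targets::tar_read(result)` retrieves the stored value. Calling `tar_make()` again skips any step whose code and inputs have not changed, which is the main difference from rerunning a plain script from top to bottom.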
The session is accompanied by a tutorial, which can be accessed here.
- Laia Domenech Burin
- Hanna Fantahun Getachew
- Chloe Fung
The material in this repository is made available under the MIT license.