Instructor: Jason M. Graham
Contact: Email: [email protected]
This course will run during the Fall 2018 semester at the University of Scranton. Content will be added to the repository as the course proceeds. For students enrolled in the course, additional material such as the official class syllabus and readings will be posted on the course's learning management system (D2L).
The contents of this repository are:
- a folder with files for a generic syllabus
- a folder with files for the guidelines for a course project
- a folder with code relevant to implementing and applying the data science workflow
To obtain the files from this repository, you can either clone the repo or download the files directly. To download, click the Clone or download button and then select Download ZIP. To clone the repo you must have git installed; information on this can be found below. With git installed, run the following command in your terminal or command window:
git clone https://github.com/jmgraham30/UoSDataSci.git
To update your local copy of the repository as new content is added, run the following command from inside the cloned folder:
git pull origin master
Alternatively, you can use the GitHub Desktop app to accomplish either of these tasks.
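If you are unsure whether to clone or pull, the choice can be sketched as a small shell helper. This is only an illustration: the function name `next_git_command` is hypothetical, and it assumes the repo lives in a folder named `UoSDataSci` (the default name that `git clone` creates) inside your current directory. It prints the command to run rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the course materials).
REPO_URL="https://github.com/jmgraham30/UoSDataSci.git"
REPO_DIR="UoSDataSci"   # assumed location: default folder name from git clone

# Print the git command that applies to the given folder:
# pull if it already holds a git repository, clone otherwise.
next_git_command() {
    if [ -d "$1/.git" ]; then
        echo "git -C $1 pull origin master"
    else
        echo "git clone $REPO_URL $1"
    fi
}

next_git_command "$REPO_DIR"
```

Copy the printed command into your terminal to run it. The `-C` option tells git to act on the named folder, so the pull works without changing directories first.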
For this course, we will use the R language. R can be downloaded for free here. In addition, we will use the RStudio integrated development environment (IDE), which can be downloaded here. It is also possible to run R and RStudio in the cloud through RStudio Cloud, which can be accessed by going here. One very nice feature of RStudio Cloud is that it offers a number of tutorials for learning R, listed under the primers tab.
A valuable reference for the course is the text R for Data Science by Grolemund and Wickham. An online version of this text can be found here.
In addition to R for Data Science, the books Data Science Design Manual and Modern Data Science with R are also very valuable resources.
It is recommended that you use the git version control system for tracking your code. If you do not already have git installed on your computer (it is probably already installed on a Mac and probably not on Windows), it can be installed from here. The GitHub Desktop app makes working with git very convenient; it can be installed here. To use GitHub you must register for an account on the website. To learn more about version control with git and GitHub, see this article or the git book. A very nice introduction to version control with git and GitHub that explains how these tools are integrated with RStudio can be found here.
It is also recommended that you have a reliable text editor installed. Atom is a good example and can be installed here.
In addition to R, you may want to have access to some of the Python data science tools. The best way to obtain these is through an Anaconda installation, which can be downloaded here.
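To see which of the command-line tools mentioned above are already available on your machine, something like the following sketch may help. The function name `check_tool` is hypothetical, and the sketch assumes a POSIX shell (for example, the macOS terminal or Git Bash on Windows). It only checks whether `git`, `R`, and `conda` are on your PATH, so a tool installed in a nonstandard location may still be reported as not found.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command-line tool is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# git (version control), R (course language), conda (installed by Anaconda)
for tool in git R conda; do
    check_tool "$tool"
done
```

For any tool reported as not found, follow the corresponding installation link above.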