Join our chatroom to keep apprised of updates, info, and general Q&A:
The goal of the course is to get students familiar with the process of reading, manipulating, and visualizing data. The course will be taught primarily in R, but will touch on related topics such as R markdown, the "grammar of graphics", Shiny, and Git.
GitHub repo for the course: https://github.com/Open-Data-Science-at-SIO/Intro-Data-Viz-Winter-2017
All participants will be expected to follow the SIO Open Data Science Code of Conduct: https://open-data-science-at-sio.github.io/mission.html
Note that this applies both to the physical space for classes and to online interactions in the chatroom, mailing list, and GitHub repository.
Students should have some familiarity with programming and/or R (e.g. past experience programming in R for an introductory stats course). A short introductory course in R (e.g. https://www.datacamp.com/courses/free-introduction-to-r) will also suffice.
Students who plan to attend should install R (https://cran.r-project.org/), RStudio (https://www.rstudio.com/products/rstudio/download/), and Git (https://git-scm.com/). While RStudio is not strictly necessary for this course, it will ensure a standard user interface for students to follow along.
Students should also create a GitHub account (https://github.com/).
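
One way to check the setup before the first class is sketched below. It is not an official course script: the package list is simply collected from the schedule in this document (leaving out `spaceMovie`, which this document links to on GitHub), and the variable names `course_packages`, `installed`, and `missing_packages` are made up for illustration.

```r
# Packages named in the course schedule (assumed list, gathered from this
# document; spaceMovie is left out because it is linked from GitHub here).
course_packages <- c("rmarkdown", "knitr", "ggplot2", "viridis", "RColorBrewer",
                     "dplyr", "tidyr", "purrr", "rgl", "gganimate", "shiny")

# TRUE for each package already present in the current library paths.
installed <- course_packages %in% rownames(installed.packages())
names(installed) <- course_packages

# Report the R version and whether a git executable is on the PATH.
cat("R version:", R.version.string, "\n")
cat("git found:", nzchar(Sys.which("git")), "\n")

# Packages that still need installing. The install call is commented out
# so that running this check has no side effects.
missing_packages <- course_packages[!installed]
print(missing_packages)
# install.packages(missing_packages)
```

Run from the RStudio console, this prints the R version, whether Git was found, and any course packages that are not yet installed.
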
Class meets every Thursday 1pm - 2:30pm in Hubbs Hall 4500 (unless otherwise noted).
Each class will be 60 min of guided code demos, followed by 30 min of Q&A / interactive lab sessions.
Students are highly encouraged to bring laptops to class to follow along.
- January 12 (Week 1)
  - Course Logistics
  - Basic Git and GitHub
  - Overview of R data types (numeric, factor, string, date & time, binary, etc.)
  - Overview of R data structures (array, list, matrix, data frames, etc.)
- January 19 (Week 2)
  - RStudio interface setup
  - Installing R packages
  - Basic R markdown (`rmarkdown` and `knitr`)
  - Reading and writing data from files & databases
  - Basic data wrangling
  - Conversion between wide and long formats
  - Data validation
- January 26 (Week 3)
  - The "grammar of graphics" (`ggplot2`) & layer system
  - Basic ggplot geoms and plots (scatterplot, histogram, bars, lines)
- February 2 (Week 4)
  - Changing colors in ggplot
  - The theme layer in ggplot
  - Custom color palettes (`viridis`, `RColorBrewer`, `spaceMovie`)
  - Adding summary statistics in plots
- February 9 (Week 5)
  - Advanced ggplot geoms and plots
  - Various plot tweaks (coordinate transformations)
  - Multi-panel plots
- February 16 (Week 6)
  - Advanced data wrangling (`dplyr` and `tidyr`): subsetting, summarizing, transformations, merging datasets
- February 23 (Week 7)
  - `lapply` (base R) and `map` (`purrr`) functions
  - R markdown chunk options (`eval`, `include`, `cache`)
- March 2 (Week 8)
  - 3d plots (`rgl`)
  - Animation (`gganimate`)
- March 9 (Week 9)
  - Interactive web apps (`shiny`)
- March 16 (Week 10 / Finals)
  - TBD (unassigned - catch-up week / guest speaker / advanced topic)
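
As a rough preview of the Week 1 topics in the schedule above, the sketch below creates one object of each listed data type and data structure using only base R. The variable names and values are invented for illustration and are not taken from the course notes.

```r
# --- Data types ---
x_num  <- c(1.5, 2, 3.25)                                  # numeric
x_chr  <- c("a", "b", "c")                                 # string (character)
x_lgl  <- c(TRUE, FALSE, TRUE)                             # binary (logical)
x_fct  <- factor(c("low", "high", "low"))                  # factor
x_date <- as.Date("2017-01-12")                            # date
x_time <- as.POSIXct("2017-01-12 13:00:00", tz = "UTC")    # date & time

# --- Data structures ---
m   <- matrix(1:6, nrow = 2)                   # matrix: 2 rows, 3 columns
arr <- array(1:24, dim = c(2, 3, 4))           # array: three dimensions
lst <- list(numbers = x_num, letters = x_chr)  # list: elements of mixed type
df  <- data.frame(id = 1:3, value = x_num, group = x_fct)  # data frame

# Inspect what was created.
class(x_fct)
dim(arr)
str(df)
```

`class()`, `dim()`, and `str()` are the usual first tools for checking what kind of object a value is before wrangling or plotting it.
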
- Simple introduction to Git that explains the jargon and various use cases -- https://speakerdeck.com/alicebartlett/git-for-humans
- A quick guide to making new repositories on GitHub and associating them with a new RStudio project -- http://happygitwithr.com/rstudio-git-github.html
- RStudio tips and tricks -- https://rawgit.com/kevinushey/2017-rstudio-conf/master/slides.html#1
- R Markdown basics -- http://rmarkdown.rstudio.com/
- Bibliographies and citations in R Markdown -- http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html
- Advanced R Markdown -- https://slides.yihui.name/2017-rstudio-conf-rmarkdown-Yihui-Xie.html#1
- Sample code for various tasks in R -- http://www.cookbook-r.com/
- RStudio cheatsheets -- https://www.rstudio.com/resources/cheatsheets/
- Basis for week 3 notes: Harvard tutorial -- http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html
- Basis for week 3 notes: Hadley slides -- http://ggplot2.org/resources/2007-vanderbilt.pdf
- ggplot2 book -- http://roger.ucsd.edu/record=b6914994~S9
- Hadley slides on ggplot2 motivation and examples -- http://ggplot2.org/resources/2007-past-present-future.pdf
- Argument against ggplot -- http://simplystatistics.org/2016/02/11/why-i-dont-use-ggplot2/
- Response to above, pro-ggplot -- http://varianceexplained.org/r/why-I-use-ggplot2/
- Reasons to use ggplot system -- https://mandymejia.wordpress.com/2013/11/13/10-reasons-to-switch-to-ggplot-7/
- theme layer documentation in ggplot -- http://docs.ggplot2.org/dev/vignettes/themes.html
- `spaceMovie` package for Star Wars palettes -- https://github.com/butterflyology/spaceMovie
- `viridis` package for color palettes -- https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
- Info about different color scales in ggplot -- http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/
- Cheatsheet about color palettes -- https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/colorPaletteCheatsheet.pdf
- ggplot gallery -- http://www.r-graph-gallery.com/portfolio/ggplot2-package/
- extensions to ggplot (other packages) -- http://www.ggplot2-exts.org/gallery/
- various tweaks to prettify a ggplot figure -- http://zevross.com/blog/2014/08/04/beautiful-plotting-in-r-a-ggplot2-cheatsheet-3/
- Tidy Data -- http://vita.had.co.nz/papers/tidy-data.pdf
- Data Wrangling cheatsheet -- https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/data-transformation-cheatsheet.pdf
- dplyr vignette -- https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html
- "R for Data Science" on map functions -- http://r4ds.had.co.nz/iteration.html#the-map-functions
- `purrr` GitHub repo -- https://github.com/hadley/purrr
- Code chunk guide (basic) -- http://rmarkdown.rstudio.com/authoring_rcodechunks.html
- R markdown guide -- http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
- `knitr` chunk options -- https://yihui.name/knitr/options/
- http://scs.math.yorku.ca/index.php/MATH_6627_2012-13_Practicum_in_Statistical_Consulting/R_tutorials/rgl_tutorial
- http://www.sthda.com/english/wiki/a-complete-guide-to-3d-visualization-device-system-in-r-r-software-and-data-visualization#setup-the-environment
- https://www.r-bloggers.com/creating-3d-geographical-plots-in-r-using-rgl/
- https://cran.r-project.org/web/packages/rgl/vignettes/rgl.html
- http://brazenly.blogspot.com/2016/08/r-graphics-tutorial-series-part-3.html
- A package that builds upon rgl: Ocean View -- vignette; additional links