Each week, Tuesdays will be primarily lecture and class discussion or
short activities, and Thursdays will be lab days.
- You should bring your laptop with you to every class
period.
- Assigned readings for the week should be completed prior to lab.
Whether you learn better by reading before lecture or by hearing the
lecture before reading is up to you.
- Weekly labs are due in GitHub by 5pm on the due date (typically
Wednesdays).
- Weekly homework assignments are due in D2L by 5pm on the due date
(typically Wednesdays).
Week 1 (Jan 19–21): Course Overview
Weekly Overview:
- Discuss course structure and expectations.
- Install R and RStudio.
- Provide a brief introduction to R.
Reading and Online Resources:
- Intro to R:
- R Installation Resources:
- R Markdown:
In-class Materials:
Week 2 (Jan 24–28): Version Control with Git
Weekly Overview:
- Create a GitHub account.
- Install Git.
- Connect RStudio to Git and GitHub.
- Provide a brief introduction to version control with Git.
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 1 - Due Monday, Jan 31 by 5pm in GitHub
Homework:
Note: To compile R Markdown documents to PDF, RStudio needs a TeX
distribution installed on your system. An easy way to install the TeX
required for PDF compilation is through the tinytex package.
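A minimal sketch of the tinytex setup mentioned in the note, run once from the R console. It assumes an internet connection and a reachable CRAN mirror; the download of the TinyTeX distribution can take several minutes.

```r
# One-time setup, assuming an internet connection is available.
install.packages("tinytex")   # install the tinytex R package from CRAN
tinytex::install_tinytex()    # download and install the TinyTeX distribution

# Optional check: TRUE if the LaTeX distribution in use is TinyTeX.
tinytex::is_tinytex()
```

After this completes, restarting RStudio before knitting to PDF is usually a good idea so the new TeX installation is picked up.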
Week 3 (Jan 31–Feb 4): Data Visualization with ggplot2
Weekly Overview:
- Understand R data structures and basic “base R” graphics functions
(leftover from last week).
- Describe the “grammar of graphics”.
- Visualize data with the ggplot2 package.
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 2 - Due Monday, Feb 7 by 5pm in GitHub
Homework:
Week 5 (Feb 14–18): R Overview and Style
Weekly Overview:
- Write code using tidyverse style guidelines
- Practice debugging code
- Write your own R functions
Reading and Online Resources:
- R Style:
- Workflow and organization in R:
In-class Materials:
In-class Lab:
- Lab 4 - Due Tuesday, Feb 22 by 5pm in GitHub
Homework:
Week 6 (Feb 21–25): Functions, Loops and Debugging
Weekly Overview:
- Write your own R functions (cont)
- Use for/while loops in simulation
- Debug using built-in tools
Reading and Online Resources:
In-class Materials:
Midterm Exam 1:
- Midterm Exam 1 will cover material from Weeks 1–4:
- Overview of R and RStudio
- Version control with Git and GitHub
- Data visualization with ggplot2
- Data wrangling with tidyverse (specifically, dplyr)
- Labs 1–4 and homeworks 1–3
- In-class component on Thursday, Feb 24
- Take-home component due in D2L Monday, Feb 28 by 5:00pm
- Examples of old exams
Week 7 (Feb 28–Mar 4): Data Wrangling 1 – Tidy Data and Relational
Data
Weekly Overview:
- Implement merge and join procedures to wrangle multiple data
sets
- Transform data from wide to long format
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 5 - Due Monday, Mar 7 by 5pm in GitHub
Homework:
Week 8 (Mar 7–11): Data Wrangling 2 – Strings, Factors,
Date/Time
Weekly Overview:
- Continue to learn data wrangling techniques
- Manipulate character strings
- Manipulate time/date objects
- Manipulate factor objects
Reading and Online Resources:
In-class Materials:
Homework:
Week 9 (Mar 21–25): Web Scraping
Weekly Overview:
- Understand the basic structure of HTML
- Use the rvest package to scrape data from the web
- Explore using functions and iteration applied to web scraping
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 6 - Due Wednesday, Mar 30 by 5pm in GitHub
Homework:
- No homework this week due to a longer Lab 6.
Week 10 (Mar 28–Apr 1): Data Visualization Principles + R Shiny
Dashboards
Weekly Overview:
- Understand and implement principles of data visualization
- Develop a beginner understanding of R Shiny
- Introduce data visualization project and select data set
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 7 - Due Wednesday, Apr 6 by 5pm in GitHub
Homework:
Week 11 (Apr 4–8): Modeling Uncertainty
Weekly Overview:
- Explore sampling variability through simulation.
- Carry out bootstrapping techniques to estimate sampling
variability.
Reading and Online Resources:
In-class Materials:
In-class Lab:
- Lab 8 - Due Wednesday, Apr 13 by 5pm in GitHub
Homework:
Project:
By April 8th at 11:00 p.m., the following must be completed and
available in your GitHub repo:
- Topic you are interested in researching (in README.md)
- Data source (in README.md)
- Two research questions (in README.md)
- R script that scrapes a data set of interest (scrape.R)
- Scraped and cleaned data set (in the data folder)
Week 12 (Apr 11–15): Review
Weekly Overview:
- Review and practice important concepts in statistical computing and
visualization using R.
In-class Materials:
Midterm Exam 2:
- Midterm Exam 2 will cover material from Weeks 5–10
- In-class component on Thursday, Apr 14
- Take-home component due in D2L (Rmd) and Gradescope (pdf) Tuesday,
Apr 19 by 12:00pm (noon)
- Examples of old exams
Week 13 (Apr 18–22): Project Week
Weekly Overview:
- Continue working on the project R Shiny dashboard
Reading and Online Resources:
In-class Materials:
Project:
By April 22nd at 11:00 p.m., the following must be completed:
- R code for the “Data Summary & Visualization” portion of the R Shiny
dashboard completed in GitHub
- R Shiny app deployed to https://www.shinyapps.io/
- Link to R Shiny app posted at the bottom of README.md in GitHub
repo
Optional content on SAS:
Based on the student survey given this week, we will cover predictive
modeling, classification, and clustering instead of SAS for the next two
weeks. If you would still like to explore SAS, we have a SAS On Demand
course set up for you where you can practice with SAS. Here are some
additional resources:
Accessing SAS:
SAS video resources:
Dr. Hoegh’s SAS videos and slides:
Week 14 (Apr 25–29): Predictive Modeling, Classification, and
Clustering
Weekly Overview:
- Introduce (review) predictive modeling methods
- Explore classification methods (supervised learning)
- Explore clustering methods (unsupervised learning)
Reading and Online Resources:
For this week and next, we will use Modern Data Science with
R, 2nd ed., by Baumer, Kaplan and Horton as our reference. We
will touch on topics in Chapters 10–12:
In-class Materials:
Videos:
In-class Lab:
- Lab 9 - Due Wednesday, May 4 by 5pm in GitHub
Homework:
Project:
By April 29th at 11:00 p.m., the following must be completed:
- R code for the complete R Shiny dashboard (including both “Data
Summary & Visualization” and “Discussions”) completed in GitHub
- Link to R Shiny app posted in D2L Project Shiny Apps discussion
board
Week 15 (May 2–6): Predictive Modeling, Classification, and
Clustering
Weekly Overview:
- Continue exploration of predictive modeling, classification, and
clustering
- Review for final exam
Reading and Online Resources:
In-class Materials:
Project:
By May 6th at 11:00 p.m., the following must be completed:
- Post comments on at least two R Shiny app discussion
posts. Each of your posts should include (a) at least one feature you
think works well, (b) at least one feature that you might have done
differently, and (c) how you could extend the study (e.g., what other
research questions does the analysis inspire?).
- Complete project group evaluation in D2L.
Final Exam
Take-home portion (optional): Released Friday, May 6 at 8:00am. Due
Monday, May 9 by 11:00pm.
In-class portion: Tuesday, May 10, 12:00–1:50pm
Your final exam grade will be the higher of the following
scores:
- In-class final exam score.
- Weighted average of your take-home and in-class final exam scores,
weighted as 40% in-class, 60% take-home.
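The grading rule above can be sketched in R as follows. The function name final_exam_grade is hypothetical and not part of the course materials. Scores are assumed to be on a 0-100 scale, and the sketch assumes that skipping the optional take-home portion means only the in-class score counts.

```r
# Hypothetical helper (not part of the course materials): computes the final
# exam grade from the in-class and take-home scores, each on a 0-100 scale.
final_exam_grade <- function(in_class, take_home = NA) {
  # Assumption: the take-home portion is optional, so without it
  # only the in-class score counts.
  if (is.na(take_home)) {
    return(in_class)
  }
  # Weighted average: 40% in-class, 60% take-home.
  weighted <- 0.4 * in_class + 0.6 * take_home
  # The final grade is the higher of the two candidate scores.
  max(in_class, weighted)
}

final_exam_grade(80, 95)  # weighted average 0.4*80 + 0.6*95 = 89 beats 80
final_exam_grade(90, 70)  # weighted average 78 is lower, so 90 stands
final_exam_grade(85)      # no take-home submitted, so 85
```

Under this rule the take-home portion can raise the final exam grade but never lower it.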