Use this .Rmd file as a template for your homework. Please use D2L to turn in both the knitted PDF output and your R Markdown file. Your .Rmd file should knit on its own when your instructor downloads it.
library(tidyverse)
The data we will use for this homework contain individual bike trips from March 2017 for the Capital BikeShare system in Washington, D.C.
bikes <- read_csv("https://math.montana.edu/shancock/data/biketrips2017.csv")
Note: This is a large file, so RStudio may take a few minutes to import the data. To avoid waiting for the data to load each time you knit the document, I have added the cache option to the R chunk above that reads in the data.
Use the str() function to summarize the data set. What does each column represent? What about each row?
Describe the difference between substr() and strsplit().
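As a starting point for comparing the two functions, here is a minimal sketch on a single made-up timestamp string. The string and its format are assumptions for illustration only; check the actual columns with str() before relying on any character positions.

```r
# Hypothetical timestamp string; the format in the real data may differ.
stamp <- "2017-03-01 08:15:42"

# substr() extracts characters by position and returns a character
# vector with the same length as its input.
hour_by_position <- substr(stamp, start = 12, stop = 13)

# strsplit() breaks each string at a delimiter and returns a list
# holding one character vector of pieces per input string.
pieces <- strsplit(stamp, split = " ")
time_part <- pieces[[1]][2]

print(hour_by_position)
print(time_part)
```

Note that the two functions return different kinds of objects, which matters when you want to store the result as a new column.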
Use one of the functions described in Exercise 2 to create a new variable for the hour a bike trip began. Then use the count() function to compute the number of trips starting at each hour.
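The pattern of creating a variable and then counting its values might look like the sketch below. It uses a tiny made-up data frame; the column name `start` and the timestamp format are hypothetical and may not match the bike trip data.

```r
library(dplyr)

# Toy data with a hypothetical column name and timestamp format.
toy_trips <- data.frame(
  start = c("2017-03-01 08:15:42", "2017-03-01 08:59:01", "2017-03-02 17:30:00"),
  stringsAsFactors = FALSE
)

hourly <- toy_trips %>%
  mutate(hour = substr(start, 12, 13)) %>%  # characters 12-13 hold the hour here
  count(hour)                               # one row per hour, with the tally in n

print(hourly)
```

count() returns one row per distinct value of the variable, with the number of rows stored in a column named n.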
Now, instead of using strings, we’ll use the lubridate package to extract parts of dates and times. Examine the lubridate cheatsheet. What function from this package could you use to extract the day of the month?
Use the function you chose in Exercise 4 to create a new variable called day that contains the day of the month. Then use this variable to compute how many trips were made on each of the 31 days in March. (Trips that started and ended on different days have already been filtered out of the data set.)
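One possible approach, sketched on made-up timestamps, is to parse the strings into date-times and then pull out the day of the month. The use of ymd_hms() assumes a year-month-day hour:minute:second format, which may not match the real data, and mday() is only one of the lubridate functions that could work here.

```r
library(lubridate)

# Hypothetical timestamps; the real column may already be a date-time.
stamps <- c("2017-03-01 08:15:42", "2017-03-31 23:05:10")

parsed <- ymd_hms(stamps)  # parse character strings into date-time objects
day <- mday(parsed)        # day of the month as a number

print(day)
```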
Create a new variable that contains the trip time.
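A trip time is the difference between two date-times. The sketch below shows one way to compute it in minutes using base R; the start and end values are made up for illustration, and the units you choose are up to you.

```r
# Hypothetical start and end times for two trips.
start <- as.POSIXct(c("2017-03-01 08:15:00", "2017-03-01 17:40:00"), tz = "UTC")
end   <- as.POSIXct(c("2017-03-01 08:45:00", "2017-03-01 19:10:00"), tz = "UTC")

# difftime() lets you fix the units explicitly rather than letting R choose.
trip_minutes <- as.numeric(difftime(end, start, units = "mins"))

print(trip_minutes)
```

Setting the units explicitly avoids surprises, since subtracting date-times directly lets R choose units based on the size of the differences.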
What percentage of bike rentals last more than an hour?
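A percentage like this can be computed as the mean of a logical condition, because TRUE counts as 1 and FALSE as 0. The durations below are made-up values in minutes, used only to show the idiom.

```r
# Hypothetical trip durations in minutes.
trip_minutes <- c(12, 45, 75, 30, 130)

# The mean of a logical vector is the proportion of TRUE values.
pct_over_hour <- 100 * mean(trip_minutes > 60)

print(pct_over_hour)
```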
Create a figure that plots trip time as a function of day of the week (Sunday, Monday, etc.). Make sure to include an informative title and to order the weekdays chronologically. Write a few sentences describing the key features of the plot.
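For the weekday ordering, one option is lubridate's wday() with label = TRUE, which returns an ordered factor so that ggplot2 places the days in calendar order rather than alphabetically. The sketch below uses a small made-up data frame; the column names, the boxplot geometry, and the labels are all assumptions, and other plot types may suit the real data better.

```r
library(ggplot2)
library(lubridate)

# Toy data with hypothetical column names.
toy <- data.frame(
  start = ymd_hms(c("2017-03-05 09:00:00", "2017-03-06 09:00:00",
                    "2017-03-06 17:30:00", "2017-03-08 12:00:00")),
  minutes = c(25, 12, 48, 90)
)

# label = TRUE gives an ordered factor, with Sunday first by default.
toy$weekday <- wday(toy$start, label = TRUE)

p <- ggplot(toy, aes(x = weekday, y = minutes)) +
  geom_boxplot() +
  labs(title = "Trip time by day of week (toy data)",
       x = "Day of week", y = "Trip time (minutes)")

print(p)
```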