Carry out the statistical investigation process.
Program a simulation-based hypothesis test.
How well can humans distinguish one “Martian” letter from another? In today’s activity, we’ll find out. When shown the two Martian letters, Kiki and Bumba, write down whether you think Bumba is on the left or on the right.
The first step of any statistical investigation is to ask a research question. In this study, the research question is: Can we as a class read Martian? (We will refine this later on!)
To answer any research question, we must design a study and collect data. For our question, the study consists of presenting each student with two Martian letters and asking which one is Bumba. Your responses will become the observed data that we will explore.
Observational units or cases are the subjects data are collected on. In a spreadsheet of the data set, each row represents a single observational unit.
\(n\) = 13
A variable is information collected or measured on each observational unit or case. Each column in a data set will represent a different variable. Today we are only measuring one variable on each observational unit.
Once we have collected data, the next step is to summarize and visualize the data.
prop <- 9/13
prop
## [1] 0.6923077
The proportion in question 6 is called a summary statistic: a single value that summarizes the data set. It is important to note that a variable is different from a summary statistic. A variable is measured on a single observational unit, while a summary statistic is calculated from a group of observational units. For example, the variable “whether or not a student lives on campus” can be measured on each individual student. In a class of 50 students, we can calculate the proportion of students who live on campus, which is the summary statistic. Look back and make sure you wrote the variable in question 4 as a variable, NOT a summary statistic.
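The difference between a variable and a summary statistic can be sketched in R. The data below are made up for illustration: a hypothetical class of 50 students, with `students` and `lives_on_campus` as assumed names. They are not data from this activity.

```r
# Hypothetical data: one row per student (one observational unit per row)
set.seed(1)  # arbitrary seed so the made-up data are reproducible
students <- data.frame(
  id = 1:50,
  lives_on_campus = sample(c("yes", "no"), size = 50, replace = TRUE)
)

# The variable is recorded separately for each individual student
head(students$lives_on_campus)

# The summary statistic is a single number computed from the whole group
prop_on_campus <- mean(students$lives_on_campus == "yes")
prop_on_campus
```

The column holds 50 values, one per student, while `prop_on_campus` is a single value describing the class as a whole.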
Looking at the data set and the summary statistic is only one way to display the data. We will also want to create a visualization or picture of the data.
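library(ggplot2)
# One row per student: 9 correct answers and 4 incorrect answers
martian <- data.frame(
  outcome = rep(c("correct", "incorrect"), c(9, 4))
)
# after_stat(prop) replaces the deprecated ..prop.. (ggplot2 3.3.0 or later)
ggplot(data = martian, aes(x = outcome)) +
  geom_bar(aes(y = after_stat(prop), group = 1), fill = "purple") +
  labs(x = "Student Answer", y = "Proportion",
       title = "Frequency of Class Guessing Correct or Incorrect",
       subtitle = "Martian Alphabet")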
# Same plot built from already-summarized proportions
martian_sum <- data.frame(
  outcome = c("correct", "incorrect"),
  proportion = c(9/13, 4/13)
)
ggplot(data = martian_sum, aes(x = outcome, y = proportion)) +
  geom_col(fill = "purple") +
  labs(x = "Student Answer", y = "Proportion",
       title = "Proportion of Class Guessing Correct or Incorrect",
       subtitle = "Martian Alphabet")
The next step is to use statistical analysis methods to draw inferences from the data. To answer the research question, we will simulate what could have happened in our class given random chance, repeat many times to understand the expected variability between different “randomly guessing” classes, then compare our class’s observed data to the simulation. This gives us an estimate of how often (or the probability of) the class’s result would occur if students were all merely guessing, allowing us to determine if the data provide evidence that we as a class can read Martian.
sample(c("correct", "incorrect"), size = 1, prob = c(0.5, 0.5))
## [1] "incorrect"
# expected number of correct answers if all 13 students are just guessing
13 * 0.5
## [1] 6.5
library(dplyr)
x <- sample(c("correct", "incorrect"), size = 13, prob = c(0.5, 0.5), replace = TRUE)
data.frame(x) %>% group_by(x) %>% count()
## # A tibble: 2 × 2
## # Groups: x [2]
## x n
## <chr> <int>
## 1 correct 5
## 2 incorrect 8
mean(ifelse(x == "correct", 1, 0))
## [1] 0.3846154
mean(x == "correct")
## [1] 0.3846154
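Both lines give the same proportion because R treats `TRUE` as 1 and `FALSE` as 0 when it sums or averages a logical vector. A minimal sketch with a small made-up vector (the names `guesses` and `is_correct` are only for illustration):

```r
# Made-up responses from four hypothetical students
guesses <- c("correct", "incorrect", "incorrect", "correct")

# The comparison returns one TRUE/FALSE value per student
is_correct <- guesses == "correct"
is_correct

sum(is_correct)   # number of correct guesses: 2
mean(is_correct)  # proportion of correct guesses: 0.5
```

This is why `mean(x == "correct")` needs no explicit `ifelse()` step.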
If students really don’t know Martian and are just guessing which is Bumba, which seems more unusual: the result from your class’s simulation in question 12 or the observed proportion of students in your class that were correct (this is your summary statistic from question 6)? Explain your reasoning.
While your observed class data is likely far different from the simulated “just-guessing” class, comparing our class data to a single simulation does not provide enough information. The differences seen could just be due to the randomness of that set of coin flips! Let’s simulate another class. Run your code from question 12 again. What was the result from your class’s second simulation? What proportion of students “guessed” correctly in the second simulation? Create a plot to compare the two simulated results with the observed class result.
We still only have a couple of simulations to compare our class data to. It would be much better to be able to see how our class compared to hundreds or thousands of “just-guessing” classes. Use a for loop in R to simulate 1,000 “just-guessing” classes and keep track of the proportion of each class that guessed correctly.
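# Preallocate storage for the 1,000 simulated proportions
sim_props <- vector("numeric", 1000)
for (i in 1:1000) {
  # One simulated class: 13 students each guessing with a 50/50 chance
  x <- sample(c("correct", "incorrect"),
              size = 13, prob = c(0.5, 0.5), replace = TRUE)
  sim_props[i] <- mean(x == "correct")
}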
data.frame(sim_props) %>% ggplot(aes(x = sim_props)) +
geom_dotplot(dotsize = 0.08)
## Bin width defaults to 1/30 of the range of the data. Pick better value with `binwidth`.
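The same simulation can also be written without an explicit loop. The sketch below is one possible alternative, not the approach the activity asks for; `simulate_class()` and `sim_props_alt` are hypothetical names introduced here, and the seed is arbitrary.

```r
# Assumed helper (not part of the activity): simulate one just-guessing class
# and return the proportion of students who guessed correctly
simulate_class <- function(n = 13) {
  guesses <- sample(c("correct", "incorrect"), size = n,
                    prob = c(0.5, 0.5), replace = TRUE)
  mean(guesses == "correct")
}

set.seed(2024)  # arbitrary seed so the sketch is reproducible

# replicate() calls the helper 1,000 times and collects the results in a vector
sim_props_alt <- replicate(1000, simulate_class(n = 13))

length(sim_props_alt)
summary(sim_props_alt)
```

Wrapping one simulated class in a function keeps the "what is one repetition" logic separate from the "how many repetitions" logic, which can make the code easier to reuse for a different class size.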
Based on your plot in question 14, is your class particularly good or bad at Martian? Explain.
Is it possible that we could see our class results just by chance if everyone was just guessing? Explain your reasoning.
Is it likely that we could see our class results just by chance if everyone was just guessing? Explain your reasoning.
A p-value is the probability of seeing the result in your data, or something “more extreme”, under the assumption of a null hypothesis. A null hypothesis is usually one of “no effect” or “just by chance”, an assumption under which it is relatively easy to simulate data.
null hypothesis (\(H_0\)) - The class can’t read Martian. OR The class is just guessing. OR The true probability of choosing the correct answer is 0.5.
alternative hypothesis (\(H_a\)) - The class can read Martian. OR The class is NOT just guessing. OR The true probability of choosing the correct answer is greater than 0.5.
p-value - The probability of getting a proportion guessing correctly of 0.6923077 or greater, assuming the true proportion of guessing correctly is equal to 0.5.
We would only see results like we saw in our class in about p-value of all possible samples of 13 students, assuming we can’t read Martian.
The probability of at least 69% of students in our class guessing correctly, assuming we can’t read Martian, is p-value.
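Because a just-guessing class amounts to 13 independent 50/50 guesses, the number of correct answers under the null hypothesis follows a binomial distribution, so the p-value can also be computed exactly. The sketch below goes beyond what the activity asks for and is offered only as a check on a simulated estimate; the variable names are illustrative.

```r
n_students <- 13  # class size
n_correct  <- 9   # observed number of correct guesses

# P(X >= 9) = P(X > 8) for X ~ Binomial(13, 0.5)
exact_p <- pbinom(n_correct - 1, size = n_students, prob = 0.5,
                  lower.tail = FALSE)
exact_p  # 1093/8192, about 0.1334

# The same one-sided probability from base R's exact binomial test
binom.test(n_correct, n_students, p = 0.5,
           alternative = "greater")$p.value
```

A simulation-based estimate should land close to this value, and it should get closer as the number of simulated classes grows.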
Use your results from question 13 to estimate the p-value for this analysis.
Write a function in R that will calculate an estimated p-value for this situation, where the function takes inputs: observed data, desired number of simulations.
p_value <- function(x, n, reps = 1000) {
  # x = number of correct guesses in observed data
  # n = sample size
  # reps = number of simulated classes
  sim_props <- numeric(reps)
  for (i in seq_len(reps)) {
    my_samp <- sample(c("correct", "incorrect"),
                      size = n, prob = c(0.5, 0.5), replace = TRUE)
    sim_props[i] <- mean(my_samp == "correct")
  }
  # share of simulated classes at least as extreme as the observed class
  return( mean(sim_props >= x/n) )
}
p_value(9, 13, reps = 10000)
## [1] 0.133
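The estimate will change a little from run to run because it is based on random simulations. A rough sketch of how much it might vary, assuming the simulated classes are independent, uses the usual standard error of a proportion, sqrt(p(1 - p) / reps). Here `p_hat` is simply the value printed above, and `mc_se` is an illustrative name.

```r
p_hat <- 0.133   # estimated p-value printed above
reps  <- 10000   # number of simulated classes used for that estimate

# Monte Carlo standard error of the estimated p-value
mc_se <- sqrt(p_hat * (1 - p_hat) / reps)
mc_se  # about 0.0034

# Rough range (estimate plus or minus 2 standard errors)
c(lower = p_hat - 2 * mc_se, upper = p_hat + 2 * mc_se)
```

With 10,000 simulated classes, repeated runs should typically agree to about two decimal places; fewer simulations would give a noisier estimate.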
In question 20, you wrote a function similar to the one_proportion_test() function, which is used to conduct these simulation-based hypothesis tests. This function can be found in the code here. Find the one_proportion_test() function in the code and compare it to your function. What does it include that yours does not?
The next step in the statistical investigation process is to communicate the results and answer the research question.
Reference for “Martian alphabet” is a TED talk given by Vilayanur Ramachandran in 2007. The synesthesia part begins at roughly the 17:30 mark.