Lesson: Driver reaction times
An all-too-common style of statistics exercise gives students some numbers and asks them to calculate a confidence interval or conduct a hypothesis test.
The StatPREP driver-reaction-time lesson is our re-interpretation, using data and computing, of a textbook-type exercise that asks for a confidence interval on the difference between two sets of numbers given in a table.
The “context” given is that the numbers come from an experiment on the reaction times of drivers making decisions based on different types of road signs. Each of the numbers in the table is described as the mean reaction time (in ms) over 20 trials.
Rather than giving a set of 20 numbers, the StatPREP lesson is based on data as the experimentalists might have collected it. The unit of observation is a driver in a trial. The drivers have identities, and the order of each driver’s twenty trials is recorded along with the actual measurement: that driver’s reaction time in that trial.
Starting with data helps students assimilate the structure of the experiment and exposes them to good data habits. It also lets students explore graphical summaries of the data, which can give additional insight. For instance, each driver’s reaction times have a skew-right distribution, corresponding to the occasional extra-long reaction time, as might be caused by drifting attention.
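The layout described above, one row per driver per trial, might look like the following sketch in R. Every name here (`Trials`, `driver`, `trial`, `sign_type`, `react_time`) and every value is invented for illustration. The number of drivers, the number of trials per sign type, and the log-normal shape of the times are assumptions, not the lesson's actual data.

```r
# Hypothetical stand-in for the lesson's data: all names and values are invented.
set.seed(101)
n_drivers <- 10   # assumed number of drivers
n_trials  <- 20   # assumed number of trials per driver for each sign type

# One row per driver per trial, with the trial order recorded
Trials <- expand.grid(
  trial     = seq_len(n_trials),
  sign_type = c("perm", "prohib"),
  driver    = paste0("D", seq_len(n_drivers)),
  stringsAsFactors = FALSE
)

# Log-normal times give a skew-right shape (an assumption made for this sketch)
Trials$react_time <- round(rlnorm(nrow(Trials), meanlog = log(600), sdlog = 0.3))

head(Trials)                             # first few driver-trial rows
table(Trials$driver, Trials$sign_type)   # number of trials in each cell
```

Keeping the driver and trial columns is what later makes per-driver summaries and paired comparisons possible.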
Some discussion questions
Looking at the data
What’s the overall pattern seen in the data? Do permissive signs have a longer reaction time than prohibitive signs? What is happening in the few points to the right of the bulk of points for each driver? Do the drivers’ reaction times differ from one another? Do all the drivers have the same relative reaction to permissive vs prohibitive signs?
The median reaction time
What is it about the data that might have led the researchers to compare the median reaction times rather than the means? Using the median reduces the influence of the outlying long reaction times. But is that a good thing? Is it the typical reaction time or the long reaction times that might impact driving safety?
It can be helpful to get in the habit of challenging researchers’ choices. That’s possible when a lesson includes some actual choices, but not when the problem is reduced to calculating a p-value on some numbers.
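A small sketch in R can make the median-versus-mean question concrete. The ten reaction times below are invented for a single hypothetical driver, with one long time standing in for a lapse of attention; they are not taken from the lesson's data.

```r
# Invented reaction times (ms) for one driver; the last trial is a lapse of attention.
react_time <- c(520, 540, 555, 560, 570, 585, 590, 610, 625, 1450)

mean(react_time)     # pulled upward by the single long time
median(react_time)   # stays with the bulk of the trials

# Replace the long time with a typical one and see how much each summary moves.
typical <- replace(react_time, 10, 600)
mean_shift   <- mean(react_time) - mean(typical)
median_shift <- median(react_time) - median(typical)
c(mean_shift = mean_shift, median_shift = median_shift)
```

In this made-up example the single long trial moves the mean by tens of milliseconds and leaves the median unchanged. Whether that insensitivity is desirable is exactly the discussion question.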
Significant vs substantial
Translate the confidence interval on the difference of median reaction times into a difference in distance travelled by a car going 20 miles per hour. (You can use this information: a 100 ms difference in reaction time corresponds to a distance travelled of about 3 feet.) Do you think that difference in distance has any practical importance, that is, in terms of traffic safety?
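The conversion can be sketched in a few lines of R. The helper `ms_to_feet()` is a hypothetical name, and the interval `ci_ms` is a placeholder, not a result from the lesson; substitute the interval from your own analysis.

```r
speed_mph <- 20
feet_per_second <- speed_mph * 5280 / 3600   # 5280 ft per mile, 3600 s per hour

# Distance covered (in feet) during a reaction-time difference given in ms
ms_to_feet <- function(ms) feet_per_second * ms / 1000

ms_to_feet(100)   # a little under 3 ft, consistent with the rule of thumb above

# Placeholder confidence interval on the difference in medians (ms), invented here
ci_ms <- c(lower = 30, upper = 90)
ms_to_feet(ci_ms)
```

Because the conversion is linear, the endpoints of the interval can be converted one at a time, and the result is still an interval at the same confidence level.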
Weakness in the analysis
Assuming that it was appropriate to look at the median reaction time, is a two-sample t-test an appropriate technique? How would a paired t-test differ here?
In designing the lesson, we weren’t sure that instructors would want to focus on the difference between paired and unpaired t-tests. But if you choose to do so, all of the calculations can be carried out within the exercise chunks provided. For the paired t-test, a little data wrangling makes clear the difference from a two-sample test.
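# Assumes dplyr, tidyr, and mosaic (for the one-sided formula in t.test) are loaded
# Spread the medians into one row per driver, with a column for each sign type
Reformatted <- Medians %>%
  spread(key = sign_type, value = median_react_time) %>%
  mutate(difference = perm - prohib)
# One-sample t-test on the within-driver differences, i.e. the paired t-test
t.test(~ difference, data = Reformatted)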
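To see how the paired and two-sample tests differ, here is a minimal base-R sketch on simulated per-driver medians. The values are invented, and the number of drivers, the effect size, and the spreads are all assumptions chosen so that drivers differ from one another much more than the sign types do.

```r
# Simulated per-driver medians (ms); every number here is invented.
set.seed(42)
n_drivers <- 10
baseline <- rnorm(n_drivers, mean = 600, sd = 80)        # driver-to-driver variation
perm     <- baseline + rnorm(n_drivers, mean = 30, sd = 15)
prohib   <- baseline + rnorm(n_drivers, mean = 0,  sd = 15)

two_sample <- t.test(perm, prohib)                 # Welch test, ignores the pairing
paired     <- t.test(perm, prohib, paired = TRUE)  # uses within-driver differences
one_sample <- t.test(perm - prohib)                # the same test as `paired`

# Compare the confidence intervals on the difference
rbind(two_sample = two_sample$conf.int, paired = paired$conf.int)
```

The paired test is a one-sample test on the per-driver differences, which removes the variation between drivers. When that variation is large, the paired interval is typically much narrower than the two-sample one.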