
Lesson: An Experiment with Paper Planes

One of the best ways for students to learn about data is to collect and enter data into a shared spreadsheet. This lesson (link to tutorial document) does exactly that.

On the surface, the lesson is about performing an experiment, reading and following an experimental protocol, and drawing conclusions from the data. Everything about the experiment is as simple as possible, including the measurement of the outcome of each trial.

But in the process of doing and analyzing the experiment, students need to enter their team’s data into a spreadsheet. The spreadsheet itself is shared by all the students in the class. That means that if any group enters their data incorrectly (e.g. not giving a number for flight distance, or using levels other than “yes” and “no” for the paperclip variable), everyone will have trouble reading the data.

You might think this is a bad thing. Why should one group’s blunder affect the others?

What makes it effective is that every group can see their own and every other group’s data. So when there is a mistake, anyone can step in to correct it.

This provides an opportunity for the instructor. Monitor the data as it’s being entered. Let people make their mistakes. But when you see one, draw the class’s attention to it and ask why the entry is an error. Some common errors you’ll see:

  • Different groups will measure distance in different units, e.g. feet or meters. Try to spot this so that the groups will have to discuss which unit they should all use in common.
    • Before students start to enter data, you might want to create a Google document containing a “code book” shared by all groups. Walk the class through writing the code book. Decide on how the school name should be identified (e.g. CCC) and list the group names. The code book is also the place to document which unit is being used for distance.
  • People will enter the flight distance not as a number but as a string of characters like “9.5 feet”. Of course, the units are important, but they belong in the code book, not in the data table.
  • Teams will be inconsistent in how they spell their team or school name. For instance, CofC, cofc, and Cofc, despite their similarity to a human reader, are to the computer three different levels of the categorical variable.
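To show the class why these entries cause trouble, it can help to let R react to them. The following is only a sketch in base R using made-up entries (the data frame `entries` and its values are invented for illustration and are not part of the lesson's spreadsheet):

```r
# Made-up entries of the kind described above (not real class data).
entries <- data.frame(
  distance    = c("9.5 feet", "12", "10.25"),
  school_name = c("CofC", "cofc", "Cofc"),
  stringsAsFactors = FALSE
)

# To the computer, the three spellings are three different levels.
n_school_levels <- length(unique(entries$school_name))    # 3

# A distance with the unit attached cannot be read as a number: it becomes NA.
distance_num <- suppressWarnings(as.numeric(entries$distance))
bad_distance_rows <- which(is.na(distance_num))           # row 1

# One possible repair for the names: trim spaces and ignore case.
school_clean <- toupper(trimws(entries$school_name))
n_school_levels_clean <- length(unique(school_clean))     # 1
```

Repairing the names in code is one option, but for the lesson it is usually more instructive to have students fix the spreadsheet itself and record the agreed spelling in the code book.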

Toward the end of the lesson, students are asked to create an informative data graphic showing all the teams’ data. There are all sorts of possibilities here. Two that we like are:

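library(mosaic)   # also attaches ggformula and ggplot2
# Each flight is a point; paperclip yes/no are dodged side by side within each team.
Plane_data %>%
   gf_point(distance ~ team_name, color = ~paperclip,
             position = position_dodge(width = 0.3))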


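# Mean distance and a confidence interval for each team, with and without the
# paperclip. df_stats() is loaded along with the mosaic package. The names
# mean_distance, lower, and upper must match the columns df_stats() returns;
# they may differ between package versions.
Plane_data %>%
 df_stats(distance ~ team_name + paperclip, mean, ci=ci.mean) %>%
 gf_pointrange(mean_distance + lower + upper ~ team_name,
               color = ~ paperclip, group = ~ paperclip,
               position = position_dodge(width = 0.2))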

Please make comments

Please make comments below in the Leave a Reply box of things you liked about the lesson, things that went well or not well, what you tried, and things that could be different. Comments are visible to everyone who views this site. Your comments can help others implement the lessons and help the StatPREP team improve the lessons.

Feedback needed

In addition to the usual feedback we ask from instructors (Did it work well? What mistakes were there? What missed opportunities?), we ask for some special feedback from instructors about the shared spreadsheet.

You see, the same spreadsheet will be shared by all schools and all classes within those schools. Eventually, we hope, the activity will be popular enough that this simple sharing won’t work, and we’ll have to arrange to let each class or institution have its own spreadsheet. But for now, here are some ground rules:

  • The instructor should be the first to open the spreadsheet. (Use the link provided in the lesson.)
  • Check whether there are lots of people logged on to the spreadsheet. If so, another class may be using it and you’ll just have to accommodate them. The good side: more data! We don’t think this will happen very much, since at least at first not many classes will be using the lesson and it’s unlikely they will be doing so simultaneously.
  • If there’s no one, or very few people logged in, it’s reasonable for you the instructor to delete the data in the spreadsheet so that your class starts with a blank slate. This is especially the case if the data format has been messed up by another class. (We know you would never do this, and that you’ll be successful in getting your students to enter workable data, but not all classes are as talented as yours!)
  • The variable names should be:
    • flight_number: a number from 1 to 20 indicating where this particular throw came in the sequence of twenty throws.
    • paperclip: “yes” or “no” depending on whether a paperclip was attached for the flight.
    • distance: a number giving the distance flown in feet.
    • team_name: some unique name given to identify which team’s flight this is, e.g. “Eagles”, “Seahawks”, “Bears”, “Daisies”, and so on.
    • school_name: an identifier for your school or your class section. All the teams who are working together should use exactly the same school_name.
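Once your class has finished entering data, a quick check against this list can catch format problems before students start graphing. Here is a minimal sketch in base R, assuming the rows for one class have been read into a data frame; the helper `check_plane_data()` and the example rows are hypothetical and not part of the lesson materials:

```r
# Hypothetical helper, not part of the lesson: returns a character vector
# describing the problems found in one class's rows (empty if none are found).
check_plane_data <- function(d) {
  expected <- c("flight_number", "paperclip", "distance",
                "team_name", "school_name")
  missing_vars <- setdiff(expected, names(d))
  if (length(missing_vars) > 0) {
    return(paste("missing variable:", missing_vars))
  }
  problems <- character(0)
  if (!is.numeric(d$distance)) {
    problems <- c(problems, "distance is not numeric")
  }
  if (!all(d$paperclip %in% c("yes", "no"))) {
    problems <- c(problems, "paperclip has levels other than yes/no")
  }
  if (!is.numeric(d$flight_number) || !all(d$flight_number %in% 1:20)) {
    problems <- c(problems, "flight_number is not a whole number from 1 to 20")
  }
  if (length(unique(d$school_name)) > 1) {
    problems <- c(problems, "more than one school_name value")
  }
  problems
}

# Made-up rows for illustration only.
good <- data.frame(flight_number = c(1, 2), paperclip = c("yes", "no"),
                   distance = c(9.5, 12), team_name = c("Eagles", "Eagles"),
                   school_name = c("CCC", "CCC"), stringsAsFactors = FALSE)
bad <- transform(good, paperclip = c("Yes", "no"),
                 school_name = c("CCC", "ccc"))

check_plane_data(good)   # no problems reported
check_plane_data(bad)    # reports the paperclip and school_name problems
```

The check for a single school_name only makes sense on the rows from one class; applied to the whole shared spreadsheet, it would flag every additional school.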
2 Comments
  1. Nicole Lang

    1) Do the students need to be told to measure distance in feet? If so, it might be good to put it in the tutorial itself (especially if the activity is done outside of class time).
    2) In the tutorial, under “The Data”- I wasn’t sure I understood what the first 2 questions were getting at. Did the parenthetical info in question 2 actually answer that question?
    3) Instructors might want a heads-up about the confidence interval “foreshadowing” here. Most of us don’t talk about confidence intervals until later in the term and we do descriptive stats at the beginning.
    4) Maybe an old-fashioned personal preference, but I’d like to see a box plot. And something about standard deviation (even if the SD just shows up as part of the results of running some code).
    5) If this is used early in the term (as is likely), most students will not have interacted with R prior to this activity. It would be useful to provide a list of variable names from which the student can choose (under All the teams). They might also need a hint about what the “color=” piece is doing – or at least instructions to experiment with the “color=” part until they figure out what it does.
    At our institution about half of the Elementary Statistics students are taking online courses. For these students, it is helpful to provide more detailed instructions.

    February 11, 2018
  2. These are great observations! I’ve modified the lesson to incorporate these points. Take a look.

    I also modified the blog post itself. The blog post (you are looking at it right now!) is intended for instructors and contains some suggestions for how to run the lesson. Obviously, as we gain experience with the lessons, with your help we’ll be able to formulate better suggestions.

    February 16, 2018
