Modeling the World with Least Squares Regression (CoRD)

Joan Brenneman, University of Central Oklahoma
Kristi Karber, University of Central Oklahoma
 

The overarching goal of this CoRD is to increase students’ ability to critically evaluate cause-and-effect claims encountered in daily life. Research has shown that participants inferred a causal relationship between variables from a statement that offered limited context and indicated only a covariational relationship (Gershman & Ullman, 2023). Moreover, college students often have difficulty distinguishing studies that show causation from those that show only correlation (Seifert et al., 2022). Seifert et al. emphasize the importance of fostering critical thinking skills, given the prevalence of data-driven claims that students will encounter throughout their lives.

The activities in this CoRD use country-level data to explore least squares regression. Students will begin with an activity designed to help prevent the formation of biases that may occur when working with this data. They will then create and interpret scatterplots to determine the form, direction, and strength of the relationship between two quantitative variables. By analyzing the covariational relationship observed in the scatterplot, they will also determine when it is appropriate to use the correlation coefficient and the least squares regression line. Lastly, students will examine correlation versus causation and the potential role of lurking variables.

Note: The data collected by the students in Activity 1 will be used throughout the remaining activities. Students will also create a PowerPoint presentation on the information that they collect and analyze. Activities 1-4 direct them to create new slides for this presentation. A sample rubric is provided in Activity 4 in case the instructor decides to have the students present their findings to the class.

References:

Gershman, S. J., & Ullman, T. D. (2023). Causal implicatures from correlational statements. PLOS ONE, 18(5). https://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0286067

Seifert, C. M., Harrington, M., Michal, A. L., & Shah, P. (2022). Causal theory error in college students’ understanding of science studies. Cognitive Research: Principles and Implications, 7(1), 4. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8755867/

 

Introduction Activity

Worksheet

Worksheet Answers

The activities in this CoRD involve gathering data from various countries. The collected data consist of percentages, rates, or averages aggregated for each country. In this activity, students will watch a video and answer multiple-choice questions about it. This introductory activity is intended to help prevent students from forming biases based solely on these aggregated numbers.

 

Introduction to Least Squares Regression

Lesson Plan

Worksheet

Worksheet Answers

 

In this activity, the class will begin with a discussion of the purpose of least squares regression. Students will differentiate between independent and dependent variables through interactive questioning. Given a list of example variables, students will select potential cause-and-effect covariational relationships. They will then use their critical thinking skills to select appropriate variables to potentially establish causation. Lastly, students will choose one of the relationships to explore further, define their variables on a PowerPoint slide, and collect data from gapminder.org for a simple random sample of 25 countries.
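The simple random sample of 25 countries could also be drawn programmatically. The following is a minimal Python sketch, not the procedure used in the worksheet: the list of country names is a hypothetical placeholder for whatever sampling frame the class obtains from gapminder.org, and the function name and seed value are chosen only for illustration.

```python
import random

# Hypothetical sampling frame (an assumption for illustration): in practice
# this would be the list of countries with data for the chosen variables.
countries = [f"Country {i:03d}" for i in range(1, 196)]


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)  # sampling without replacement


sample = simple_random_sample(countries, 25, seed=2024)
print(sorted(sample))
```

Sampling without replacement guarantees that no country appears twice, and fixing the seed lets a student reproduce the same sample later.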

 

Scatterplots, Linearity, and Correlation

Lesson Plan

Worksheet

Worksheet Answers

In this activity, students will analyze scatterplots and correlation coefficients for quantitative variables and use them, when appropriate, to describe covariational relationships. Students will then apply this knowledge to create a scatterplot and calculate the correlation coefficient for their data using Excel. They will create two more slides for their PowerPoint presentation that will contain their scatterplot, a description of the relationship between their chosen variables, and the value of the correlation coefficient with an explanation of whether it is appropriate for their data.
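The activity itself uses Excel, but the correlation coefficient can be computed by hand from its definition. The sketch below is an illustration in Python with made-up data values (they are not from gapminder.org); the function name `pearson_r` is an assumption of this example. The result should match what Excel's CORREL function returns for the same two columns.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data (an assumption, not real country data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))
```

Because r measures only the strength of a linear relationship, it should be reported only after the scatterplot has been checked for a roughly linear form.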

Modeling and Prediction with Least Squares Regression

Lesson Plan

Worksheet

Excel File

Worksheet Answers

In this activity, students will learn that the least squares regression line is a linear equation that models the relationship between two quantitative variables. Through critical thinking questions on data provided by the instructor, they will learn when it is appropriate to use the least squares regression line for prediction. Lastly, they will calculate the least squares regression line for their own data and, based on the scatterplot and the correlation coefficient they found, evaluate whether it will accurately predict the value of the dependent variable from a value of the independent variable. They will then determine what would be the best predictor for their data.
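As a rough illustration of what the least squares calculation involves, the Python sketch below computes the slope and intercept from the standard formulas and makes one prediction. The data values are made up, and the function names and the choice to refuse predictions outside the observed range of the independent variable are assumptions of this example rather than part of the worksheet. The slope and intercept should agree with Excel's SLOPE and INTERCEPT functions for the same data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept


# Made-up illustrative data (an assumption, not real country data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(x, y)


def predict(x_new):
    """Predict y from x_new, refusing to extrapolate beyond the observed x values."""
    if not min(x) <= x_new <= max(x):
        raise ValueError("x_new is outside the observed range (extrapolation)")
    return intercept + slope * x_new


prediction = predict(3.5)
print(f"y-hat = {intercept:.2f} + {slope:.2f}x; prediction at x = 3.5: {prediction:.2f}")
```

The range check reflects one common caution about prediction: the fitted line describes the data only over the values of the independent variable that were actually observed.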

 

Correlation versus Causation

Lesson Plan

Worksheet

Worksheet Answers

Presentation Rubric

In this activity, students will be introduced to the concepts of correlation versus causation, lurking variables, confounding, and common response. They will examine some unusual linear relationships on gapminder.org and use their knowledge and critical thinking skills to determine whether common response or confounding is at play. Finally, students will evaluate the relationships they explored in previous activities and decide whether common response or confounding may be present in their data. They will then add this information to a final slide in their PowerPoint presentation.
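A small simulation can make common response concrete. In the Python sketch below, which uses entirely simulated data and is not part of the activity materials, a lurking variable z drives both x and y, while x and y have no direct effect on each other. The noise level, sample size, seed, and variable names are all assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n = 500

# Lurking variable z, plus two variables that each respond to z with
# independent noise. Neither x nor y influences the other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)

# In this simulation the effect of z is known exactly, so it can be
# subtracted out; what remains is only the independent noise.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(f"correlation of x and y: {r_xy:.2f}")
print(f"correlation after removing z: {r_resid:.2f}")
```

The first correlation is strong even though x does not cause y, and it largely disappears once the shared influence of z is removed. With real data the lurking variable is usually unmeasured and its effect unknown, which is why a strong correlation alone cannot establish causation.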

 

This work is licensed under CC BY-NC-SA 4.0