Walden University Week 10 Multiple Regression Using Dummy Variable Paper

Discussion:

Week 10: Dummy Variables, Regression Diagnostics, and Model Evaluation

By now, you have gained quite a bit of experience estimating regression models. One thing you may have noticed is that you have not yet been able to include categorical predictor or control variables. In social science, many of the predictor variables that we might want to use are inherently qualitative and measured categorically (e.g., race, gender, and political party affiliation). This week, you will learn how to use categorical variables in your multiple regression models.

While we have discussed a great deal about the benefits of multiple regression, we have been reticent about what can go wrong in our models. For our models to provide accurate estimates, we must adhere to a set of assumptions. Given the dynamics of the social world, data gathered are often far from perfect. This week, you will examine all of the assumptions of multiple regression and how you can test for them.

Learning Objectives

Students will:
  • Analyze multiple regression testing using dummy variables
  • Analyze measures for multiple regression testing
  • Construct research questions
  • Evaluate assumptions of multiple regression testing
  • Analyze assumptions of correlation and bivariate regression
  • Analyze implications for social change

Learning Resources

Required Readings

Wagner, III, W. E. (2020). Using IBM® SPSS® statistics for research methods and social science statistics (7th ed.). Thousand Oaks, CA: Sage Publications.

  • Chapter 2, “Transforming Variables”
  • Chapter 11, “Editing Output” (previously read in Weeks 2, 3, 4, 5, 6, 7, 8, and 9)

Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Pine Forge Press/Sage Publications.
Multiple Regression: A Primer, by Allison, P. D. Copyright 1998 by Sage College. Reprinted by permission of Sage College via the Copyright Clearance Center.

  • Chapter 6, “What are the Assumptions of Multiple Regression?” (pp. 119–136)

  • Chapter 7, “What can be done about Multicollinearity?” (pp. 137–152)

Warner, R. M. (2012). Applied statistics from bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.
Applied Statistics From Bivariate Through Multivariate Techniques, 2nd Edition by Warner, R.M. Copyright 2012 by Sage College. Reprinted by permission of Sage College via the Copyright Clearance Center.

  • Chapter 12, “Dummy Predictor Variables in Multiple Regression”

Fox, J. (Ed.). (1991). Regression diagnostics. Thousand Oaks, CA: SAGE Publications.

  • Chapter 3, “Outlying and Influential Data” (pp. 22–41)
  • Chapter 4, “Non-Normally Distributed Errors” (pp. 41–49)
  • Chapter 5, “Nonconstant Error Variance” (pp. 49–54)
  • Chapter 6, “Nonlinearity” (pp. 54–62)
  • Chapter 7, “Discrete Data” (pp. 62–67)

Note: You will access these chapters through the Walden Library databases.

Document: Walden University: Research Design Alignment Table

Datasets

Document: Data Set 2014 General Social Survey (dataset file)
Use this dataset to complete this week’s Discussion.
Note: You will need the SPSS software to open this dataset.

Document: Data Set Afrobarometer (dataset file)
Use this dataset to complete this week’s Assignment.
Note: You will need the SPSS software to open this dataset.

Document: High School Longitudinal Study 2009 Dataset (dataset file)
Use this dataset to complete this week’s Assignment.
Note: You will need the SPSS software to open this dataset.

Required Media

Laureate Education (Producer). (2016m). Regression diagnostics and model evaluation [Video file]. Baltimore, MD: Author.
Note: The approximate length of this media piece is 7 minutes.
In this media program, Dr. Matt Jones demonstrates regression diagnostics and model evaluation using the SPSS software.

Laureate Education (Producer). (2016). Dummy variables [Video file]. Baltimore, MD: Author.
Note: This media program is approximately 12 minutes.
In this media program, Dr. Matt Jones demonstrates dummy variables using the SPSS software.

Optional Resources

Skill Builder: Interpreting Regression Coefficients for Dummy-Coded Variables
To access these Skill Builders, navigate back to your Blackboard Course Home page, and locate “Skill Builders” in the left navigation pane. From there, click on the relevant Skill Builder link for this week.

You are encouraged to click through these and all Skill Builders to gain additional practice with these concepts. Doing so will bolster your knowledge of the concepts you’re learning this week and throughout the course.


Discussion: Estimating Models Using Dummy Variables

You have had plenty of opportunity to interpret coefficients for metric variables in regression models. Using and interpreting categorical variables takes just a little bit of extra practice. In this Discussion, you will have the opportunity to practice how to recode categorical variables so they can be used in a regression model and how to properly interpret the coefficients. Additionally, you will gain some practice in running diagnostics and identifying any potential problems with the model.
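As a brief illustration of how a dummy coefficient is read, consider a model with one metric predictor and one dummy variable. The symbols X (metric predictor) and D (dummy coded 0 or 1) are placeholders chosen for this sketch, not variables from the course datasets.

```latex
\hat{Y} = b_0 + b_1 X + b_2 D

% Reference group (D = 0):
\hat{Y} = b_0 + b_1 X

% Group coded 1 (D = 1):
\hat{Y} = (b_0 + b_2) + b_1 X
```

Under this setup, b_2 is the difference in predicted Y between the group coded 1 and the reference group, holding X constant. The two groups share the same slope b_1 and differ only in their intercepts.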

To prepare for this Discussion:

  • Review Chapter 12 of the Warner text, Chapter 2 of the Wagner course text, and the media programs found in this week’s Learning Resources, and consider the use of dummy variables.
  • Create a research question using the General Social Survey dataset that can be answered by multiple regression. Using the SPSS software, choose a categorical variable to dummy code as one of your predictor variables.
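The course uses SPSS for recoding, but the logic of dummy coding can be sketched in a few lines of Python. The example below is only an illustration: the function name `dummy_code`, the variable `party`, and its values are made up for this sketch and do not come from the General Social Survey file. A variable with k categories becomes k - 1 indicator variables, with one category left out as the reference group.

```python
def dummy_code(values, reference):
    """Return k-1 indicator columns for a categorical variable.

    The `reference` category gets no column of its own; it is the group
    against which every dummy coefficient is compared. Missing values
    (None) stay missing in every indicator column.
    """
    categories = sorted(set(values) - {reference, None})
    return {
        f"is_{c}": [None if v is None else int(v == c) for v in values]
        for c in categories
    }


# Hypothetical three-category variable (illustrative values only).
party = ["democrat", "republican", "independent", "democrat"]

# "independent" is the reference group, so only two dummies are created.
dummies = dummy_code(party, reference="independent")
print(dummies)
```

Leaving out one category matters: if all k indicators were entered together with the intercept, they would be perfectly collinear and the model could not be estimated.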

By Day 3

Estimate a multiple regression model that answers your research question. Post your response to the following:

  1. What is your research question?
  2. Interpret the coefficients for the model, specifically commenting on the dummy variable.
  3. Run diagnostics for the regression model. Does the model meet all of the assumptions? Be sure to comment on which assumptions were not met and the possible implications. Is there any possible remedy for one of the assumption violations?
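One of the diagnostics covered in this week’s readings is multicollinearity. As a rough sketch of the idea, the variance inflation factor (VIF) for a model with exactly two predictors reduces to 1 / (1 - r²), where r is the correlation between the two predictors. The Python below illustrates only that special case; the function names and the six data values are invented for the example and are not GSS results. In SPSS, collinearity statistics are requested as part of the regression procedure instead.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up values: years of education and a 0/1 dummy variable.
educ = [12, 16, 14, 18, 12, 16]
liberal = [0, 1, 0, 1, 1, 0]

vif = vif_two_predictors(educ, liberal)
print(round(vif, 2))
```

A VIF of 1 means the predictors are uncorrelated. Larger values mean the predictors overlap more, which inflates the standard errors of their coefficients. With more than two predictors, each VIF instead comes from regressing that predictor on all of the others.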

Be sure to support your Main Post and Response Post with reference to the week’s Learning Resources and other scholarly evidence in APA Style.

Expert Solution Preview

Introduction:

As a medical professor tasked with creating assignments and evaluating student performance, it is important to understand how to incorporate categorical variables into regression models. Additionally, understanding the assumptions of multiple regression and how to test for them is crucial in ensuring accurate estimates. In this discussion, we will focus on estimating models using dummy variables and running diagnostics to identify any potential problems with the model.

Research Question:

Using the General Social Survey dataset, our research question is: “Do education level and political ideology predict attitudes towards government-funded healthcare?”

Interpreting Coefficients:

We dummy-coded the “political ideology” variable with “1” representing liberal and “0” representing conservative. The regression model yielded the following coefficients:

– Education Level: B = 0.31, p
