PROJECT PART 3: Z-SCORES AND LINEAR REGRESSION
This section further investigates possible outliers in your data and examines your initial hypothesis that your two variables are associated. Submit your project as a professional report that includes everything from Parts 1 and 2, using the following template. The description in italics indicates the information required in each section.

(Part 1 and Part 2 of your project here)

Z-Scores and Outliers

Above, you decided whether you have any outliers for your two variables. A second definition of an outlier is a point that lies more than 2 standard deviations from the mean. For each variable, find the data point farthest from the mean (it could be either above or below the mean). Find the z-score for each of these points. Based on the z-score, is either of these points considered an outlier by this definition? If so, was it also an outlier based on the fences for that variable (which you found in Part 2)?

Show all your calculations typed neatly using appropriate word processing software with a mathematics package (for example, Equation Editor or MathType in MS Word).

Linear Regression and Correlation

In your research proposal (Part 1) you discussed the relationship that might exist between your two quantitative variables. You are now going to examine this relationship using linear regression.

Based on what you wrote in Part 1 of your project, state which variable you are selecting as your explanatory (x) variable and which as your response (y) variable, and explain why you made this decision.

Create a scatterplot of your explanatory and response variables. It must look professional (hand-drawn scatterplots will receive no credit). Indicate what technology was used and give a brief description of the process.

Based on your scatterplot, discuss the direction, form, and strength of the association using appropriate statistical terminology. Are there any suspected outliers or clusters?
Is a linear model appropriate based on your scatterplot?

Find the linear regression equation and report the correlation coefficient using your choice of technology. Indicate what technology was used and give a brief description of the process.

Does your correlation coefficient confirm your observations of the scatterplot from the previous section? Explain why or why not.

Discussion

Do the results of your linear regression and correlation analysis appear to confirm or contradict your initial belief that these variables were associated in some way? Critically evaluate this conclusion by addressing both the evidence that supports the purported relationship between your two variables and the cautions or problems with your data that would weaken your case. The direction, form, and strength of the association in your scatterplot, as well as the strength of the correlation coefficient, should be discussed in this evaluation. If your data contained outliers, explain how they affect the validity of the model. Additionally, given the sampling method you used and the limitations you reported in Part 2, discuss how well your conclusion about your original supposition generalizes to the greater population.
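
As an illustration only, the calculations requested under Z-Scores and Outliers and under Linear Regression and Correlation could be checked with a short script like the one below. It is a minimal sketch, not a required method: the function names `farthest_point_z` and `least_squares` and the variables `hours` and `scores` are hypothetical, the data values are made up, and the sample standard deviation (n - 1 denominator) is assumed. The z-score is computed as the data value minus the mean, divided by the standard deviation, and the line is the usual least-squares fit.

```python
from statistics import mean, stdev


def farthest_point_z(values):
    """Return (value, z-score) for the data point farthest from the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1)
    farthest = max(values, key=lambda v: abs(v - m))
    return farthest, (farthest - m) / s


def least_squares(x, y):
    """Return (slope, intercept, r) for the line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical data, invented purely for illustration.
    hours = [1, 2, 3, 4, 5, 6, 7, 20]          # explanatory (x) variable
    scores = [52, 55, 61, 64, 70, 73, 78, 95]  # response (y) variable

    for name, data in (("hours", hours), ("scores", scores)):
        value, z = farthest_point_z(data)
        # Second definition of an outlier: more than 2 SDs from the mean.
        flag = "outlier" if abs(z) > 2 else "not an outlier"
        print(f"{name}: farthest point {value}, z = {z:.2f} ({flag})")

    slope, intercept, r = least_squares(hours, scores)
    print(f"y-hat = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}")
```

A script like this only verifies the arithmetic. The report itself still needs the typed calculations, a description of the technology used, and your interpretation of the results.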