Answers Section

MULTIPLE CHOICE

1. ANS: A

2. ANS: D

3. ANS: A

4. ANS: D

5. ANS: D

6. ANS: D

7. ANS: C

8. ANS: C

9. ANS: D

10. ANS: D

11. ANS: C

12. ANS: C

13. ANS: C

14. ANS: A

15. ANS: B

16. ANS: B

17. ANS: D

18. ANS: D

19. ANS: B

MATCHING

20. ANS: B

21. ANS: A

22. ANS: C

23. ANS: D

24. ANS: E

SHORT ANSWER

25. ANS:

No, the sample is too small. Also, marks on a mathematics test measure students’ skills at writing mathematics tests, which is not necessarily the same as their ability to understand mathematical logic.

26. ANS:

a) The correlation coefficient is 0.88.

b) Although this coefficient shows a strong positive linear correlation, the sample is far too small to draw any conclusions. Also, the number of passes each player caught could be affected by extraneous variables such as the position played and the proportion of each game that the player was actually on the field.

27. ANS:

The cubic regression has .

28. ANS:

No, the slope indicates only how y varies with x on the line of best fit; it does not give any information about how closely this line fits the data.
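As a rough illustration of this point, the Python sketch below fits a least-squares line to two made-up data sets. The data values and the helper name slope_and_r are hypothetical and are not taken from the question. Both sets give a fitted slope of 2, yet their correlation coefficients are very different.

```python
from math import sqrt

def slope_and_r(xs, ys):
    """Return (slope of the least-squares line, correlation coefficient r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
tight = [2, 4, 6, 8, 10]        # points lie exactly on y = 2x
scattered = [5, -2, 12, 2, 13]  # same fitted slope, much more scatter

slope_tight, r_tight = slope_and_r(xs, tight)
slope_scattered, r_scattered = slope_and_r(xs, scattered)
print(slope_tight, round(r_tight, 2))
print(slope_scattered, round(r_scattered, 2))
```

The slope is the same in both cases, so only r shows how closely each line fits its data.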

29. ANS:

Answers may vary. Valid responses include

a) strong positive correlation, cause-and-effect relationship

b) weak negative correlation, cause-and-effect relationship (the more people, the more fishing and pollution)

c) strong positive correlation, cause-and-effect relationship (high temperature and humidity can make people more susceptible to breathing problems) or common-cause factor (the weather conditions that cause high temperatures and humidity often cause a build-up of airborne pollutants as well)

d) weak or moderate correlation, accidental relationship or common-cause factor (inflation)

e) moderate positive correlation, presumed relationship

30. ANS:

Outliers should be excluded only when there is good reason to believe they are not representative of the population.

31. ANS:

The exponential regression has .

32. ANS:

Answers may vary. Students should identify the independent, dependent, and extraneous variables and describe the effects of the extraneous variables.

33. ANS: (93, 19)

34. ANS:

a) moderate to strong positive linear correlation

b) little or no correlation

c) strong positive linear correlation

d) moderate negative linear correlation

e) strong negative linear correlation

35. ANS:

The quadratic regression has .

36. ANS:

Answers may vary. The scatter plots should have the following characteristics:

a) moderate negative linear correlation, age as the independent variable

b) strong negative linear correlation, oven temperature as the independent variable

c) no correlation, amount of exposure to sunlight as the independent variable

d) strong positive linear correlation, age as the independent variable

37. ANS:

Answers may vary. Possible sources of error include

38. ANS:

Answers may vary.

PROBLEM


39. ANS:


a) A graphing calculator or a spreadsheet can be used to find the correlation coefficient. Its value is 0.920, indicating that there is a strong linear correlation between the sales of cellphones and the numbers of traffic accidents in this city.


b) The strong correlation suggests a cause-and-effect relationship, but the correlation by itself does not prove that cellphone use is causing an increase in traffic accidents. Factors unrelated to cellphones could be responsible for some or all of the increase in accidents. For example, population growth is a possible common-cause factor that could increase the sales of cellphones as well as the number of people driving, which is likely to increase the number of traffic accidents.


c) Answers may vary. Traffic congestion is an extraneous variable that could increase the number of accidents, in part because drivers become frustrated and impatient. A recession could be a hidden variable. During a slowdown in the economy, people tend to keep their cars longer, so there could be an increase in accidents due to mechanical failures. Faced with a drop in tax revenues, the municipal and provincial governments might cut back on road maintenance, which could contribute to accidents. Cutbacks in road maintenance and construction also increase traffic congestion and could indirectly increase the number of accidents.


40. ANS:

a) 0.959

b) A calculator or software will give a value of 0.959 29....

c) The two methods give the same result. Note that this correlation coefficient is not accurate beyond three decimal places because the data have only two or three significant digits.

d) There is a strong positive correlation between hand span and height for this sample of adults.


41. ANS:

a)


Year     Snowfall, x (cm)   Corn Height, y (cm)   x²        y²        xy
1995     173                182                   29 929    33 124    31 486
1996     165                190                   27 225    36 100    31 350
1997     152                207                   23 104    42 849    31 464
1998     184                180                   33 856    32 400    33 120
1999     178                184                   31 684    33 856    32 752
Totals   852                943                   145 798   178 329   160 172


r = -0.9473


b) The correlation coefficient indicates a strong negative correlation: the height of the corn plants decreases as total snowfall for the previous winter increases.
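The value of r in part a) can be checked from the column totals. The sketch below assumes the usual computational formula for the correlation coefficient, r = [nΣxy − ΣxΣy] / √([nΣx² − (Σx)²][nΣy² − (Σy)²]); the variable names are chosen here for illustration.

```python
from math import sqrt

# Column totals from the table in part a); n = 5 years of data.
n = 5
sum_x, sum_y = 852, 943
sum_x2, sum_y2, sum_xy = 145_798, 178_329, 160_172

# Computational formula for the correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(round(r, 4))  # -0.9473
```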


42. ANS:

Answers may vary. Possible extraneous factors include local fishing, international fishing fleets with factory trawlers, the annual seal hunt, and changes in the ecosystem, especially increases in water temperature.

43. ANS:

a)

b) Answers may vary. Exponential, quadratic, cubic, and power regressions are shown below.

c) The exponential regression model has the highest coefficient of determination. In fact, its value for r² is very close to 1, indicating an almost perfect fit to the data.

d) Using the exponential regression equation,

After 8.5 h, there will be approximately 35 000 cells.

e) Using the exponential regression equation,

There will be 100 000 cells in about 10.0 h.

44. ANS:

a) y = 2.51x + 2.34

b)

The equation of the line of best fit is y = 2.51x + 2.33.

c) The two equations are almost exactly the same. The small difference is due to rounding in the calculation using the formula.

d) As shown in the calculator screen above, r = 0.86.

e) There is a strong positive correlation between the number of advertisements placed and the sales of stereo systems. This correlation suggests—but does not prove—that the advertisements are effective.

45. ANS:

When a manufacturing process or production line starts up for the first time, it usually requires a number of small adjustments to get it working properly. As a result, the first batches of the product have a relatively high defect rate. This defect rate drops quickly as adjustments are made to correct the defects found in each successive batch. Thus, the defect rate decreases as the total number of products made increases. Once these start-up adjustments have been completed, the defect rate will remain almost constant until the machinery begins to wear out. So, there will be a period when the correlation between the defect rate and the number of items produced is almost zero, followed by a positive correlation when the defect rate starts to increase.




46. ANS:

a)


b) The line of best fit can be found using a graphing calculator, a spreadsheet, or Fathom™.

The equation of the line of best fit is y = 0.224x + 13.2, and r is 0.996. Since the correlation coefficient is nearly 1, the data have almost a perfect linear correlation. The linear model fits the data very closely.


c)

The equation of the curve of best fit is y = 3.50x^0.501, which is almost y = 3.50√x. The coefficient of determination is 0.9997, so this model is also an excellent fit to the data.


d) There is little difference in how well the two models fit the data. However, for a speed of 0 km/h, the linear model predicts a stopping distance of 13.2 m, while the power model gives a value of 0. So, the power regression is more accurate for extrapolations to lower speeds.
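The comparison at 0 km/h can be reproduced with a short Python sketch. It assumes the power model has the form y = 3.50x^0.501, with x the speed in km/h and y the stopping distance in metres; the function names are hypothetical.

```python
def linear_model(speed_kmh):
    """Stopping distance (m) from the line of best fit y = 0.224x + 13.2."""
    return 0.224 * speed_kmh + 13.2

def power_model(speed_kmh):
    """Stopping distance (m) from the power regression y = 3.50x^0.501."""
    return 3.50 * speed_kmh ** 0.501

linear_at_rest = linear_model(0)
power_at_rest = power_model(0)
print(linear_at_rest)  # 13.2
print(power_at_rest)   # 0.0
```

A stationary vehicle needs no stopping distance, so only the power model gives a physically sensible value at this speed.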


e) Using the linear model,

Using the power-regression model,


47. ANS:

Answers may vary. Each study should have an experimental and a control group that are as similar as possible. One possibility is to interchange the two groups halfway through the study. The initial study should measure whether the medication is effective in reducing or preventing the symptoms of asthma. The follow-up study would compare groups taking current medications with one taking the new drug. This study should consider both the effectiveness and the side effects of the medications.


48. ANS:

a) The amount of sleep is the independent variable, so it is shown on the x-axis.


b) The correlation coefficient can be calculated using the formula below, the LinReg(ax+b) instruction on a graphing calculator, or the CORREL function in a spreadsheet.


c) There is a strong positive correlation between amounts of sleep and average marks for these students. However, this correlation does not prove that getting more sleep causes higher marks.

49. ANS:

a) The scatter plot for weight loss versus time on the miracle-shake diet shows a strong positive linear correlation. The regression line is y = 0.395x – 1.46, and the correlation coefficient is 0.99. There appears to be an almost perfect positive correlation between weight loss and time on the program.

b) Use the linear model to predict weight loss after 120 days.

y = 0.395(120) – 1.46

= 45.9

The linear model predicts that a person on the miracle-shake diet for 120 days will lose about 46 kg. However, this prediction is an extrapolation, and rapid weight loss cannot continue indefinitely. The linear model is unlikely to be accurate for diets that last much longer than those in the data set. The dieter’s initial weight will also be a factor. For example, a 60-kg person will not be able to lose 46 kg. Clearly, a dieter’s weight at the start of the diet is an extraneous variable that could have a substantial effect. Overall, there is a significant risk that the predictions using the linear model will be inaccurate.
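The extrapolation in part b) can be sketched in Python as follows. The function name is hypothetical, and the 60-kg starting weight is the example used in the answer above, not a value from the data set.

```python
def predicted_weight_loss(days):
    """Weight loss (kg) predicted by the regression line y = 0.395x - 1.46."""
    return 0.395 * days - 1.46

loss_120 = predicted_weight_loss(120)
print(round(loss_120, 1))  # 45.9

# Applying the prediction to a 60-kg dieter shows why the extrapolation fails.
remaining_weight = 60 - loss_120
print(round(remaining_weight, 1))  # 14.1
```

A remaining body weight of about 14 kg is clearly impossible, which is consistent with the warning that the linear model cannot be trusted far outside the range of the data.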


c) Although a strong correlation appears to exist between weight loss and time on the miracle-shake diet, you cannot conclude that the miracle shakes cause the weight loss. The company guarantees weight loss only if you follow its dietary program. This program may well be a low-fat, low-calorie diet that would result in weight loss with or without the miracle shakes. Other possible extraneous variables include the dieters’ individual metabolisms, body types, initial weights, and changes in physical activity.


d) The sample is small and could have intentional bias. There is no indication that the individuals in the commercial were randomly chosen from the population of miracle-shake customers. Quite likely, the company carefully selected the best success stories. The disclaimer that “Individual results may vary” suggests that the company knows the selected results are not representative and is being careful to avoid legal liability for its claims.

50. ANS:

a)

b) The linear regression can be done with a graphing calculator, a spreadsheet, or Fathom™. As shown in the spreadsheet screen above, the equation for the line of best fit is y = –2.48x + 51.6 with r = –0.494.

c) There is a moderate negative linear correlation between the sprint times and throwing distances.

d) On the scatter plot, points (7.55, 40) and (7.75, 26) appear to be outliers since they are somewhat removed from the rest of the data.

e) If the two possible outliers are removed, the line of best fit becomes y = –2.17x + 49.0 with r = –0.827.

f) There appears to be a negative linear correlation between the sprint times and throwing distances. In other words, the faster runners tend to throw the ball farther. However, a sample of 12 is too small to make any reliable predictions, and the coach does not have enough data to determine whether the possible outliers really are outliers. The correlation between sprint times and throwing distances may, in fact, be only moderate.

g) Using the regression with the possible outliers,

Using the regression without the possible outliers,

51. ANS:

a)

b) The point (7, 4) could be an outlier. On the scatter plot, this point is somewhat distant from the rest of the data.

c) The linear regression can be done with a graphing calculator, a spreadsheet, or Fathom™. As shown in the spreadsheet screen above, the equation for the line of best fit is y = –0.58x + 12.36 and r = –0.669.

d) Without the outlier, the equation for the line of best fit becomes y = –0.80x + 15.56 with r = –0.992.

e) The outlier had a substantial effect on both the line of best fit and the correlation coefficient. Without the outlier, the data have a nearly perfect negative linear correlation.

f) The sample size is very small.