OCR S1 2013 January — Question 3 12 marks

Exam Board: OCR
Module: S1 (Statistics 1)
Year: 2013
Session: January
Marks: 12
Topic: Linear regression
Type: Comment on reliability/validity of prediction
Difficulty: Moderate (-0.3). This is a standard S1 regression and correlation question requiring routine application of formulas. Parts (i) and (iv) involve straightforward calculations using given summations, part (ii) reads a value from a graph, part (iii) asks for standard reasons about extrapolation/reliability, and part (v) tests understanding that correlation is scale-invariant. All techniques are textbook exercises with no novel problem-solving required, making it slightly easier than average.
Spec: 5.08a Pearson correlation: calculate pmcc; 5.08b Linear coding: effect on pmcc; 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression

The Gross Domestic Product per Capita (GDP), \(x\) dollars, and the Infant Mortality Rate per thousand (IMR), \(y\), of 6 African countries were recorded and summarised as follows. \(n = 6\) \quad \(\sum x = 7000\) \quad \(\sum x^2 = 8700000\) \quad \(\sum y = 456\) \quad \(\sum y^2 = 36262\) \quad \(\sum xy = 509900\)
  (i) Calculate the equation of the regression line of \(y\) on \(x\) for these 6 countries. [4]
The original data were plotted on a scatter diagram and the regression line of \(y\) on \(x\) was drawn, as shown below. \includegraphics{figure_3}
  (ii) The GDP for another country, Tanzania, is 1300 dollars. Use the regression line in the diagram to estimate the IMR of Tanzania. [1]
  (iii) The GDP for Nigeria is 2400 dollars. Give two reasons why the regression line is unlikely to give a reliable estimate for the IMR for Nigeria. [2]
  (iv) The actual value of the IMR for Tanzania is 96. The data for Tanzania (\(x = 1300, y = 96\)) is now included with the original 6 countries. Calculate the value of the product moment correlation coefficient, \(r\), for all 7 countries. [4]
  (v) The IMR is now redefined as the infant mortality rate per hundred instead of per thousand, and the value of \(r\) is recalculated for all 7 countries. Without calculation state what effect, if any, this would have on the value of \(r\) found in part (iv). [1]
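
A worked sketch of the two calculation parts may help here. It is not the official mark scheme, and the symbols \(S_{xx}\), \(S_{yy}\), \(S_{xy}\), \(a\) and \(b\) are notation assumed for this sketch rather than taken from the question. For the regression line of \(y\) on \(x\) with \(n = 6\):
\[
S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n} = 8700000 - \frac{7000^2}{6} \approx 533333.3,
\qquad
S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 509900 - \frac{7000 \times 456}{6} = -22100
\]
\[
b = \frac{S_{xy}}{S_{xx}} \approx -0.0414,
\qquad
a = \bar{y} - b\bar{x} \approx 76 + 0.0414375 \times 1166.67 \approx 124.3,
\qquad
y \approx 124 - 0.0414x
\]
For the correlation coefficient with Tanzania included, the sums become \(n = 7\), \(\sum x = 8300\), \(\sum x^2 = 10390000\), \(\sum y = 552\), \(\sum y^2 = 45478\), \(\sum xy = 634700\), so that:
\[
S_{xx} = 10390000 - \frac{8300^2}{7} \approx 548571.4,
\qquad
S_{yy} = 45478 - \frac{552^2}{7} \approx 1948.86,
\qquad
S_{xy} = 634700 - \frac{8300 \times 552}{7} \approx -19814.3
\]
\[
r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} \approx \frac{-19814.3}{\sqrt{548571.4 \times 1948.86}} \approx -0.606
\]
The negative sign agrees with the negative gradient of the regression line: in this sample, higher GDP goes with lower IMR.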

## (v)
Answer | Marks | Guidance
No effect oe | B1, [1] | Stay the same oe; Allow just "No"; Ignore all else
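
One way to see why there is no effect, offered as a sketch and not as part of the mark scheme: rescaling from per thousand to per hundred replaces each \(y\) by \(y' = y/10\). The symbols \(S_{xx}\), \(S_{yy}\), \(S_{xy}\) and \(r'\) are notation assumed here, with \(S_{xx} = \sum x^2 - (\sum x)^2/n\) and the others defined analogously. Then \(S_{xy'} = S_{xy}/10\) and \(S_{y'y'} = S_{yy}/100\), so
\[
r' = \frac{S_{xy'}}{\sqrt{S_{xx} S_{y'y'}}} = \frac{S_{xy}/10}{\sqrt{S_{xx} S_{yy}/100}} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = r
\]
The factor of 10 cancels, which is why no recalculation is needed.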

---
The Gross Domestic Product per Capita (GDP), $x$ dollars, and the Infant Mortality Rate per thousand (IMR), $y$, of 6 African countries were recorded and summarised as follows.

$n = 6$ \quad $\sum x = 7000$ \quad $\sum x^2 = 8700000$ \quad $\sum y = 456$ \quad $\sum y^2 = 36262$ \quad $\sum xy = 509900$

\begin{enumerate}[label=(\roman*)]
\item Calculate the equation of the regression line of $y$ on $x$ for these 6 countries. [4]
\end{enumerate}

The original data were plotted on a scatter diagram and the regression line of $y$ on $x$ was drawn, as shown below.

\includegraphics{figure_3}

\begin{enumerate}[label=(\roman*)]
\setcounter{enumi}{1}
\item The GDP for another country, Tanzania, is 1300 dollars. Use the regression line in the diagram to estimate the IMR of Tanzania. [1]

\item The GDP for Nigeria is 2400 dollars. Give two reasons why the regression line is unlikely to give a reliable estimate for the IMR for Nigeria. [2]

\item The actual value of the IMR for Tanzania is 96. The data for Tanzania ($x = 1300, y = 96$) is now included with the original 6 countries. Calculate the value of the product moment correlation coefficient, $r$, for all 7 countries. [4]

\item The IMR is now redefined as the infant mortality rate per hundred instead of per thousand, and the value of $r$ is recalculated for all 7 countries. Without calculation state what effect, if any, this would have on the value of $r$ found in part (iv). [1]
\end{enumerate}

\hfill \mbox{\textit{OCR S1 2013 Q3 [12]}}