| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Major |
| Session | Specimen |
| Marks | 11 |
| Topic | Linear regression |
| Type | Comment on reliability/validity of prediction |
| Difficulty | Standard +0.3. This is a straightforward interpretation question requiring reading residuals from a graph, substituting into a regression equation, and commenting on interpolation versus extrapolation. The multi-part structure and the need to interpret correlation magnitudes add slight complexity, but these are standard A-level statistics tasks requiring no novel insight or complex calculations. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08c Pearson: measure of straight-line fit; 5.08g Compare: Pearson vs Spearman; 5.09e Use regression: for estimation in context |

Question 3:

| Part | Answer | Marks | AO | Guidance |
|---|---|---|---|---|
| 3 (i) | At (24, 11): residual $= 11 - (17.138 - 0.3727 \times 24) = 11 - 8.1932 = 2.81$ | B1 M1 A1 [3] | 1.1a, 1.1, 1.1 | Subtraction other way round scores M1 only |
| 3 (ii) (A) | $x = 26$, $y = 7.45$. Interpolation and points lie fairly close to the line so probably a good estimate | B1 E1 [2] | 1.1, 3.5a | Must mention both |
| 3 (ii) (B) | $x = 16$, $y = 11.17$. Extrapolation so probably not reliable | B1 E1 [2] | 1.1, 3.5b | |
| 3 (iii) | The only factor with a large effect size when correlated with hours of sleep is danger. It seems that the more dangerous the animal’s situation, the less time it spends asleep | E1 E1 [2] | 2.2b, 2.2b | Or any other relevant comment, e.g. stating that the data do not demonstrate causality, or saying something relevant about the other factors |
| 3 (iv) | There are outliers which affect the size of the pmcc … A linear model may well be suitable for the data with these outliers removed | E1 E1 [2] | 3.5b, 3.5c | Accept ‘is suitable’. Or any other comment, e.g. redraw scatter diagram (or recalculate pmcc) without outliers |

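
The arithmetic behind parts (i) and (ii) can be written out in full. The sketch below assumes the regression line is $\hat{y} = 17.138 - 0.3727x$, with coefficients taken from the mark scheme working for part (i), and takes the residual as observed minus fitted. The symbols $\hat{y}$ and $e$ are notation introduced here for illustration, and `align*` assumes the `amsmath` package is loaded.

```latex
% Regression line of y on x (coefficients as they appear in the mark scheme working)
\[ \hat{y} = 17.138 - 0.3727x \]

% Part (i): residual = observed value - fitted value, at the data point (24, 11)
\begin{align*}
\hat{y}(24) &= 17.138 - 0.3727 \times 24 = 17.138 - 8.9448 = 8.1932 \\
e &= y - \hat{y}(24) = 11 - 8.1932 = 2.8068 \approx 2.81
\end{align*}

% Part (ii): predictions at x = 26 (A) and x = 16 (B)
\begin{align*}
\hat{y}(26) &= 17.138 - 0.3727 \times 26 = 17.138 - 9.6902 = 7.4478 \approx 7.45 \\
\hat{y}(16) &= 17.138 - 0.3727 \times 16 = 17.138 - 5.9632 = 11.1748 \approx 11.17
\end{align*}
```

Reversing the subtraction in part (i) gives $-2.81$, which the mark scheme credits with M1 only. Whether each prediction is an interpolation or an extrapolation depends on the range of $x$ shown in Fig. 3.1; the mark scheme treats $x = 26$ as inside that range and $x = 16$ as outside it.
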
3 A researcher is investigating factors that might affect how many hours per day different species of mammals spend asleep.
First she investigates human beings. She collects data on body mass index, $x$, and hours of sleep, $y$, for a random sample of people. A scatter diagram of the data is shown in Fig. 3.1 together with the regression line of $y$ on $x$.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-04_885_1584_598_274}
\captionsetup{labelformat=empty}
\caption{Fig. 3.1}
\end{center}
\end{figure}
\begin{enumerate}[label=(\roman*)]
\item Calculate the residual for the data point which has the residual with the greatest magnitude.
\item Use the equation of the regression line to estimate the mean number of hours spent asleep by a person with body mass index\\
(A) 26,\\
(B) 16,\\
commenting briefly on each of your predictions.

The researcher then collects additional data for a large number of species of mammals and analyses different factors for effect size. Definitions of the variables measured for a typical animal of the species, the correlations between these variables, and guidelines often used when considering effect size are given in Fig. 3.2.
\begin{center}
\begin{tabular}{|l|p{0.6\textwidth}|}
\hline
Variable & Definition \\
\hline
Body mass & Mass of animal in kg \\
\hline
Brain mass & Mass of brain in g \\
\hline
Hours of sleep/day & Number of hours per day spent asleep \\
\hline
Life span & How many years the animal lives \\
\hline
Danger & A measure of how dangerous the animal's situation is when asleep, taking into account predators and how protected the animal's den is: higher value indicates greater danger. \\
\hline
\end{tabular}
\end{center}
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|}
\hline
Correlations (pmcc) & Body Mass & Brain Mass & Hours of sleep/day & Life span & Danger \\
\hline
Body Mass & 1.00 & & & & \\
\hline
Brain Mass & 0.93 & 1.00 & & & \\
\hline
Hours of sleep/day & $-0.31$ & $-0.36$ & 1.00 & & \\
\hline
Life span & 0.30 & 0.51 & $-0.41$ & 1.00 & \\
\hline
Danger & 0.13 & 0.15 & $-0.59$ & 0.06 & 1.00 \\
\hline
\end{tabular}
\end{center}
\begin{table}[h]
\begin{center}
\begin{tabular}{ | c | c | }
\hline
\begin{tabular}{ c }
Product moment \\
correlation coefficient \\
\end{tabular} & Effect size \\
\hline
0.1 & Small \\
\hline
0.3 & Medium \\
\hline
0.5 & Large \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 3.2}
\end{center}
\end{table}
\item State two conclusions the researcher might draw from these tables, relevant to her investigation into how many hours mammals spend asleep.

One of the researcher's students notices the high correlation between body mass and brain mass and produces a scatter diagram for these two variables, shown in Fig. 3.3 below.
\begin{figure}[h]
\begin{center}
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-05_675_698_1802_735}
\captionsetup{labelformat=empty}
\caption{Fig. 3.3}
\end{center}
\end{figure}
\item Comment on the suitability of a linear model for these two variables.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Major Q3 [11]}}