| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics A AS |
| Year | 2021 |
| Session | November |
| Marks | 11 |
| Topic | Linear regression |
| Type | Assess model appropriateness from context |
| Difficulty | Moderate -0.3 This is a straightforward linear regression question requiring standard calculations (regression line equation) and interpretation (model appropriateness, extrapolation reliability). All techniques are routine for Further Statistics AS level, though the multi-part structure and contextual reasoning elevate it slightly above pure mechanical calculation. The mathematical content is entirely standard with no novel problem-solving required. |
| Spec | 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context |
Question 6:

| Question | Part | Answer | Marks | AO | Guidance |
|---|---|---|---|---|---|
| 6 | (a) | Because it would be too time consuming and/or expensive to use all of the people in the hospital | E1 [1] | 2.4 | |
| 6 | (b)(i) | Line correctly plotted | B1 [1] | 1.1 | Line should pass through (20, 200) and (80, 140) |
| 6 | (b)(ii) | The model seemed pretty unsatisfactory as most of the points are above the line, with some quite a long way above | E1 [1] | 2.4 | Or e.g.: The model seems to be reasonably good for younger people but underpredicts MHR for older people. Allow any reasonable answer |
| 6 | (c) | $\bar{x} = 46.1$, $\bar{y} = 181.9$, $b = \dfrac{S_{xy}}{S_{xx}} = \dfrac{164998 - (922 \times 3638/20)}{47250 - 922^{2}/20} = \dfrac{-2713.8}{4745.8}$ | M1 | 1.1a | For attempt at gradient ($b$) |
| | | $= -0.5718$ (OR $b = \dfrac{164998/20 - (46.1 \times 181.9)}{47250/20 - 46.1^{2}} = \dfrac{-135.69}{237.29} = -0.5718$) | A1 | 1.1 | For $-0.5718$. Allow $-0.572$ |
| | | Hence least squares regression line is: $y - 181.9 = -0.5718(x - 46.1)$ | M1 | 1.1 | For equation of line |
| | | $\Rightarrow y = -0.5718x + 208.3$ | A1 [4] | 1.1 | FT their $b < 0$ for complete equation in the form $y = mx + c$ |
| 6 | (d) | Prediction for 40 years is 185.4 | B1 | 1.1 | FT their equation (not $y = 220 - x$) provided $b < 0$. Allow 185 |
| | | Prediction for 5 years is 205.4 | B1 [2] | 1.1 | FT their equation (not $y = 220 - x$) provided $b < 0$. Allow 205 |
| 6 | (e) | Prediction for 40 years is reliable as interpolation and points reasonably close to line | E1 | 3.5a | |
| | | Prediction for 5 years is less likely to be reliable as extrapolation | E1 [2] | 3.5b | |

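
The working for parts (c) and (d) can be checked numerically. The following is a minimal Python sketch, assuming only the summary statistics quoted in the question ($n$, $\sum x$, $\sum y$, $\sum x^2$, $\sum xy$). The variable names and the `predict` helper are illustrative choices, not part of the mark scheme.

```python
# Summary statistics quoted in the question: n, sum of x, sum of y,
# sum of x squared, sum of xy (x = age in years, y = maximum heart rate).
n = 20
sum_x, sum_y = 922, 3638
sum_x2, sum_xy = 47250, 164998

# Sample means of age and maximum heart rate.
x_bar = sum_x / n
y_bar = sum_y / n

# Corrected sums: S_xx and S_xy.
s_xx = sum_x2 - sum_x**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares gradient b and intercept a for the regression of y on x.
b = s_xy / s_xx
a = y_bar - b * x_bar


def predict(age):
    """Predicted maximum heart rate in bpm (hypothetical helper name)."""
    return a + b * age


print(f"y = {b:.4f}x + {a:.1f}")
for age in (40, 5):
    print(f"age {age}: predicted MHR {predict(age):.1f}")
```

Rounded as in the mark scheme, this gives a gradient of $-0.5718$, an intercept of $208.3$, and predictions of 185.4 and 205.4 for ages 40 and 5. The code cannot judge reliability: the age 5 prediction is computed in exactly the same way as the age 40 one, even though it lies outside the range of the sample.
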
6 A health researcher is investigating the relationship between age and maximum heart rate.
A commonly quoted formula states that 'maximum heart rate $= 220 -$ age in years'. The researcher wants to check if this formula is a satisfactory model for people who work in the large hospital where she is employed. The researcher selects a random sample of 20 people who work in her hospital, and measures their maximum heart rates.
\begin{enumerate}[label=(\alph*)]
\item Explain why the researcher selects a sample, rather than using all of the people who work in the hospital.
The ages, $x$ years, and maximum heart rates, $y$ beats per minute, of the people in the researcher's sample are summarised as follows.\\
$n = 20 \quad \sum x = 922 \quad \sum y = 3638 \quad \sum x ^ { 2 } = 47250 \quad \sum y ^ { 2 } = 664610 \quad \sum x y = 164998$
These data are illustrated below.\\
\includegraphics[max width=\textwidth, alt={}, center]{5be067ff-4668-48d6-8ed2-b8dfa3e678f7-5_758_1246_1027_244}
\item \begin{enumerate}[label=(\roman*)]
\item Draw the line which represents the formula 'maximum heart rate $= 220 -$ age in years' on the copy of the scatter diagram in the Printed Answer Booklet.
\item Comment on how well this model fits the data.
\end{enumerate}
\item Determine the equation of the regression line of maximum heart rate on age.
\item Use the equation of the regression line to predict the values of the maximum heart rate for each of the following ages.
\begin{itemize}
\item 40 years
\item 5 years
\end{itemize}
\item Comment on the reliability of your predictions in part (d).
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics A AS 2021 Q6 [11]}}
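
For part (b), a rough numerical check of the '220 - age' formula is possible from the summary statistics alone. The sketch below is an illustration under stated assumptions: the function name `formula_mhr` is hypothetical, and the ages 20 and 80 are simply the two plotting points suggested in the mark scheme guidance. It uses the fact that the mean of the residuals $y - (220 - x)$ equals $\bar{y} - (220 - \bar{x})$.

```python
# Summary statistics quoted in the question.
n = 20
sum_x, sum_y = 922, 3638

x_bar = sum_x / n  # mean age in the sample
y_bar = sum_y / n  # mean measured maximum heart rate


def formula_mhr(age):
    """Maximum heart rate from the commonly quoted formula (hypothetical helper name)."""
    return 220 - age


# Two points that fix the straight line to be drawn in part (b)(i).
line_points = [(age, formula_mhr(age)) for age in (20, 80)]

# Mean residual of the formula: a positive value means that, on average,
# the measured heart rates lie above the '220 - age' line.
mean_residual = y_bar - formula_mhr(x_bar)

print(line_points)
print(f"mean residual: {mean_residual:.1f} bpm")
```

A positive mean residual is consistent with the observation that most points lie above the line. It is only an average, though: it does not show how the fit varies with age, which still has to be read from the scatter diagram.
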