OCR MEI Further Statistics A AS 2021 November — Question 6 11 marks

Exam Board: OCR MEI
Module: Further Statistics A AS
Year: 2021
Session: November
Marks: 11
Topic: Linear regression
Type: Assess model appropriateness from context
Difficulty: Moderate -0.3. This is a straightforward linear regression question requiring a standard calculation (the regression line equation) and interpretation (model appropriateness, reliability of extrapolation). All techniques are routine for Further Statistics AS level, though the multi-part structure and contextual reasoning raise it slightly above purely mechanical calculation. The mathematical content is entirely standard, with no novel problem-solving required.
Spec: 5.09a Dependent/independent variables; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09d Linear coding: effect on regression; 5.09e Use regression: for estimation in context


Question 6:
Question | Part | Answer | Marks | AOs | Guidance
6 | (a) | Because it would be too time consuming and/or expensive to use all of the people in the hospital | E1 [1] | 2.4
6 | (b)(i) | Line correctly plotted | B1 [1] | 1.1 | Line should pass through (20, 200) and (80, 140)
6 | (b)(ii) | The model seems pretty unsatisfactory, as most of the points are above the line, with some quite a long way above | E1 [1] | 2.4 | Or e.g.: The model seems to be reasonably good for younger people but underpredicts MHR for older people. Allow any reasonable answer
6 | (c) | $\bar{x} = 46.1$, $\bar{y} = 181.9$; $b = \dfrac{S_{xy}}{S_{xx}} = \dfrac{164998 - (922 \times 3638/20)}{47250 - 922^{2}/20} = \dfrac{-2713.8}{4745.8}$ | M1 | 1.1a | For attempt at gradient ($b$)
6 | (c) | $b = -0.5718$ (OR $b = \dfrac{164998/20 - (46.1 \times 181.9)}{47250/20 - 46.1^{2}} = \dfrac{-135.69}{237.29} = -0.5718$) | A1 | 1.1 | For $-0.5718$. Allow $-0.572$
6 | (c) | Hence least squares regression line is: $y - 181.9 = -0.5718(x - 46.1)$ | M1 | 1.1 | For equation of line
6 | (c) | $\Rightarrow y = -0.5718x + 208.3$ | A1 [4] | 1.1 | FT their $b < 0$ for complete equation in the form $y = mx + c$
6 | (d) | Prediction for 40 years is 185.4 | B1 | 1.1 | FT their equation (not $y = 220 - x$) provided $b < 0$. Allow 185
6 | (d) | Prediction for 5 years is 205.4 | B1 [2] | 1.1 | FT their equation (not $y = 220 - x$) provided $b < 0$. Allow 205
6 | (e) | Prediction for 40 years is reliable as interpolation and points reasonably close to line | E1 | 3.5a
6 | (e) | Prediction for 5 years is less likely to be reliable as extrapolation | E1 [2] | 3.5b
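
The arithmetic in parts (c) and (d) can be checked numerically. The following is a minimal sketch, not part of the mark scheme: it works only from the summary statistics given in the question, and the names `regression_from_sums`, `predict` and `mean_gap` are illustrative choices. It also computes the mean of $y - (220 - x)$ over the sample, which is one way to quantify the comment in part (b)(ii) that most points lie above the line $y = 220 - x$.

```python
# Summary statistics quoted in the question (n = 20 people in the sample).
n = 20
sum_x, sum_y = 922, 3638          # sums of ages and of maximum heart rates
sum_x2, sum_xy = 47250, 164998    # sum of x^2 and sum of x*y


def regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (intercept a, gradient b) of the least squares line y = a + b*x."""
    s_xy = sum_xy - sum_x * sum_y / n   # S_xy
    s_xx = sum_x2 - sum_x ** 2 / n      # S_xx
    b = s_xy / s_xx
    # The least squares line passes through the mean point (x-bar, y-bar).
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = regression_from_sums(n, sum_x, sum_y, sum_x2, sum_xy)


def predict(age):
    """Predicted maximum heart rate (beats per minute) from the fitted line."""
    return a + b * age


# Average amount by which the sample lies above the quoted formula y = 220 - x.
mean_gap = sum_y / n - (220 - sum_x / n)

print(f"regression line: y = {b:.4f}x + {a:.1f}")
print(f"prediction at age 40: {predict(40):.1f}")
print(f"prediction at age 5:  {predict(5):.1f}")
print(f"mean of y - (220 - x): {mean_gap:.1f}")
```

Rounded to the precision used in the mark scheme, this gives a gradient of $-0.5718$, an intercept of $208.3$, and predictions of $185.4$ and $205.4$. The mean gap of $8.0$ beats per minute shows that, on average, the sample sits above the quoted formula. The code says nothing about reliability: the prediction at age 5 remains an extrapolation, as noted in part (e).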
6 A health researcher is investigating the relationship between age and maximum heart rate.

A commonly quoted formula states that 'maximum heart rate $= 220 -$ age in years'. The researcher wants to check if this formula is a satisfactory model for people who work in the large hospital where she is employed. The researcher selects a random sample of 20 people who work in her hospital, and measures their maximum heart rates.
\begin{enumerate}[label=(\alph*)]
\item Explain why the researcher selects a sample, rather than using all of the people who work in the hospital.

The ages, $x$ years, and maximum heart rates, $y$ beats per minute, of the people in the researcher's sample are summarised as follows.\\
$n = 20 \quad \sum x = 922 \quad \sum y = 3638 \quad \sum x ^ { 2 } = 47250 \quad \sum y ^ { 2 } = 664610 \quad \sum x y = 164998$

These data are illustrated below.\\
\includegraphics[max width=\textwidth, alt={}, center]{5be067ff-4668-48d6-8ed2-b8dfa3e678f7-5_758_1246_1027_244}
\item \begin{enumerate}[label=(\roman*)]
\item Draw the line which represents the formula 'maximum heart rate $= 220 -$ age in years' on the copy of the scatter diagram in the Printed Answer Booklet.
\item Comment on how well this model fits the data.
\end{enumerate}
\item Determine the equation of the regression line of maximum heart rate on age.
\item Use the equation of the regression line to predict the values of the maximum heart rate for each of the following ages.

\begin{itemize}
  \item 40 years
  \item 5 years
\end{itemize}
\item Comment on the reliability of your predictions in part (d).
\end{enumerate}

\hfill \mbox{\textit{OCR MEI Further Statistics A AS 2021 Q6 [11]}}