| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2014 |
| Session | June |
| Marks | 12 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate -0.8. This is a standard S1 linear regression question requiring straightforward application of given formulas with provided summary statistics. All calculations follow routine procedures (\(S_{yy}\), PMCC, regression equation, prediction) with no conceptual challenges or novel problem-solving required. Easier than the average A-level question because it is purely procedural and the summaries are given. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
| Age (x) | 20 | 25 | 30 | 45 | 55 | 60 | 65 | 70 |
| Volume (y) | 74 | 76 | 77 | 72 | 68 | 67 | 64 | 62 |
| Answer | Marks | Guidance |
|---|---|---|
| (a) \(S_{yy} = 39418 - \frac{560^2}{8} = 218\) | M1, A1cao (2) | For correct expression for \(S_{yy}\); condone 218.0 |
| (b) \([r =] \frac{-710}{\sqrt{218 \times 2587.5}} = -0.945344...\) awrt \(-0.945\) | M1, A1 (2) | For attempt at correct formula with their \(S_{yy}\) (>0) and given \(S_{xy}\), \(S_{xx}\) in correct places. Condone missing "−" for M1. −0.95 with no expression scores M1A0, awrt −0.945 with no working scores M1A1 |
| (c) As age increases, volume/blood (pumped) decreases (o.e.) | B1 (1) | Must mention "age" and "volume" or "blood (pumped)". No ft. |
| (d) Yes as \(r\) is close to \(-1\) (if \(r < -0.5\)) or Yes as \(r\) is close to \(1\) (if \(r > 0.5\)). For ft, if \(-0.5 \le r \le 0.5\) "No since \(r\) is close to 0" | B1ft, dB1 (1) | Must comment on supporting and state: high/strong/clear (negative or positive) correlation. 'Points lie close to a line' is B0 since there is no evidence of this. Do not follow through if \(\lvert r \rvert > 1\). |
| (e) \(b = \frac{-710}{2587.5} = -0.27439...\) (allow \(\frac{-284}{1035}\)) awrt \(-0.27\) | M1 A1 (4) | 1st M1 for correct expression for \(b\). Condone missing "−"; 1st A1 for awrt −0.27 or allow exact fraction. |
| \(a = \frac{560}{8} - \text{their } b' \times \frac{370}{8} = [82.690...]\) so \(y = 82.7 - 0.274x\) | M1, A1 | 2nd M1 for correct method for \(a\). Follow through their value of \(b\) and \(x\) with \(a =\) awrt 82.7 and \(b =\) awrt −0.274. No fractions. |
| (f)(i) \((y = 82.7 - 0.274 \times 40 =) 71.74...\) \(=\) awrt \(72\) | B1 (2) | |
| (f)(ii) Should be reliable since interpolation (o.e.) | B1 | [12] |
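
The arithmetic in parts (a), (b), (e) and (f)(i) can be checked numerically. The following is a minimal sketch in plain Python, not part of the mark scheme; the variable names and the `predict` helper are illustrative assumptions. It recomputes every summary statistic from the raw data table rather than relying on the given values.

```python
from math import sqrt

# Raw data from the question: age (x, years) and volume of blood (y, ml).
ages = [20, 25, 30, 45, 55, 60, 65, 70]
volumes = [74, 76, 77, 72, 68, 67, 64, 62]
n = len(ages)

sum_x = sum(ages)
sum_y = sum(volumes)

# Summary statistics: S_xx, S_yy and S_xy.
s_xx = sum(x * x for x in ages) - sum_x ** 2 / n
s_yy = sum(y * y for y in volumes) - sum_y ** 2 / n
s_xy = sum(x * y for x, y in zip(ages, volumes)) - sum_x * sum_y / n

# (b) Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# (e) Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n


def predict(age):
    """Estimate volume from the unrounded regression line (hypothetical helper)."""
    return a + b * age


print(f"S_yy = {s_yy}")
print(f"r = {r:.3f}")
print(f"y = {a:.1f} + ({b:.3f})x")
print(f"estimate at x = 40: {predict(40):.1f}")
```

Because this sketch keeps the unrounded values of \(a\) and \(b\), its estimate at \(x = 40\) differs slightly from the 71.74 obtained with the rounded coefficients in the mark scheme; both round to 72.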
---
\begin{enumerate}
\item A medical researcher is studying the relationship between age ($x$ years) and volume of blood ($y$ ml) pumped by each contraction of the heart. The researcher obtained the following data from a random sample of 8 patients.
\end{enumerate}
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | }
\hline
Age (x) & 20 & 25 & 30 & 45 & 55 & 60 & 65 & 70 \\
\hline
Volume (y) & 74 & 76 & 77 & 72 & 68 & 67 & 64 & 62 \\
\hline
\end{tabular}
\end{center}
[You may use $\sum x = 370$, $S_{xx} = 2587.5$, $\sum y = 560$, $\sum y^{2} = 39418$, $S_{xy} = -710$]\\
(a) Calculate $S_{yy}$.\\
(b) Calculate the product moment correlation coefficient for these data.\\
(c) Interpret your value of the correlation coefficient.\\
The researcher believes that a linear regression model may be appropriate to describe these data.\\
(d) State, giving a reason, whether or not your value of the correlation coefficient supports the researcher's belief.\\
(e) Find the equation of the regression line of $y$ on $x$, giving your answer in the form $y = a + bx$.\\
Jack is a 40-year-old patient.\\
(f) (i) Use your regression line to estimate the volume of blood pumped by each contraction of Jack's heart.\\
(ii) Comment, giving a reason, on the reliability of your estimate.\\
\hfill \mbox{\textit{Edexcel S1 2014 Q1 [12]}}