| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2013 |
| Session | June |
| Marks | 11 |
| Topic | Bivariate data |
| Type | Calculate summary statistics (Sxx, Syy, Sxy) |
| Difficulty | Moderate (-0.3). This is a standard S1 bivariate data question requiring routine application of the summary statistics formulas (Syy = Σy² - (Σy)²/n, Sxy = Σxy - ΣxΣy/n) and a PMCC calculation. Parts (d) and (e) require conceptual understanding of how adding a data point whose x-value equals the mean affects these statistics, which lifts it slightly above pure calculation, but it remains a typical textbook exercise with well-rehearsed techniques and no novel problem-solving. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08b Linear coding: effect on pmcc; 5.08c Pearson: measure of straight-line fit |
| Answer | Marks | Total | Guidance |
|---|---|---|---|
| **(a)** $S_{yy} = 393 - \frac{61^2}{10} = \mathbf{20.9}$, $S_{xy} = 382 - \frac{61 \times 60}{10} = \mathbf{16}$ | M1A1, A1 | (3) | M1 for a correct expression for $S_{yy}$ or $S_{xy}$; 1st A1 for $S_{yy} = 20.9$; 2nd A1 for $S_{xy} = 16$ |
| **(b)** $[r] = \frac{"16"}{\sqrt{20.9 \times 28}} = 0.66140...\text{ awrt } \mathbf{0.661}$ | M1, A1 | (2) | M1 for a correct expression for $r$; ft their 20.9 (provided it is > 0) and their 16. Use of 382 for 16 or 393 for 20.9 is M0; A1 for awrt 0.661 |
| **(c)** Researcher's belief suggests **negative** correlation, data suggest **positive** correlation. So the data do not support the researcher's belief | B1, dB1 | (2) | 1st B1 for a suitable reason contrasting belief with data. They must state the sign (positive or negative) of the correlation of the data or the belief and imply the other is opposite; 2nd dB1 dependent on a correct reason for saying the data do not support the claim. E.g. stating "does not support the belief because data has positive correlation" scores B1B1, but stating "does support the belief because data has positive correlation" scores B0B0 |
| **(d)** New $x$ equals $\bar{x} = 6$. Since $S_{xx} = \sum(x - \bar{x})^2$, the value of $S_{xx}$ is the same, $= 28$ | B1, dB1 | (2) | 1st B1 for clearly stating new value of $x$ = (6 =) mean; 2nd dB1 dependent on 1st B1, for a reason that shows $S_{xx}$ is unchanged, e.g. extra term is 0 so $S_{xx}$ is the same |
| **(d) ALT** | | | 1st B1 for seeing $\sum x = 66$ and new $\sum x^2 = 424$ (or $388 + 6^2$) and an attempt at $S_{xx}$; 2nd B1 for showing $S_{xx} = 28$ with $n = 11$, no incorrect working seen, and a final comment |
| **(e)** $S_{xy} = \sum(x - \bar{x})(y - \bar{y}) = \sum(x - \bar{x})y$, so the new term will be zero (since new $x$ = mean) and $S_{xy}$ is the same, while $S_{yy}$ increases. So $r$ will decrease | B1, dB1 | (2) [11] | 1st B1 for a clear reason that mentions $S_{xy}$ is the same **and** the increase in $S_{yy}$. Saying $r$ increases or stays the same is B0B0; 2nd dB1 dependent on 1st B1, for saying $r$ will decrease |
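
The arithmetic behind parts (a), (b), (d) and (e) can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme: all variable names are illustrative, and the new value of $S_{yy}$ used for part (e) is a hypothetical figure, since the question states only that $S_{yy}$ increases.

```python
import math

# Summary statistics given in the question (n = 10 children).
n = 10
sum_x, sum_y = 60, 61
sum_y2, sum_xy = 393, 382
s_xx = 28

# Part (a): S_yy = sum(y^2) - (sum y)^2 / n and S_xy = sum(xy) - (sum x)(sum y) / n
s_yy = sum_y2 - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Part (b): r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)

# Part (d): "Turner" has 6 letters, which equals the mean of x (60 / 10 = 6).
# Recover sum(x^2) from S_xx, then recompute S_xx with n = 11 (the ALT method).
sum_x2 = s_xx + sum_x**2 / n
new_x = len("Turner")
new_s_xx = (sum_x2 + new_x**2) - (sum_x + new_x)**2 / (n + 1)

# Part (e): S_xy and S_xx are unchanged while S_yy increases, so r decreases.
# ASSUMPTION: 25.0 is a made-up larger value of S_yy, used only for illustration.
hypothetical_new_s_yy = 25.0
new_r = s_xy / math.sqrt(new_s_xx * hypothetical_new_s_yy)

print("S_yy =", round(s_yy, 1), " S_xy =", round(s_xy, 1), " r =", round(r, 3))
print("new S_xx =", new_s_xx, " r decreases:", new_r < r)
```

Any value of `hypothetical_new_s_yy` above 20.9 gives the same conclusion, because the numerator $S_{xy}$ and the factor $S_{xx}$ in the denominator are unchanged.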
---
5. A researcher believes that parents with a short family name tended to give their children a long first name. A random sample of 10 children was selected and the number of letters in their family name, $x$, and the number of letters in their first name, $y$, were recorded.
The data are summarised as:
$$\sum x = 60, \quad \sum y = 61, \quad \sum y^{2} = 393, \quad \sum xy = 382, \quad \mathrm{S}_{xx} = 28$$
\begin{enumerate}[label=(\alph*)]
\item Find $\mathrm { S } _ { y y }$ and $\mathrm { S } _ { x y }$
\item Calculate the product moment correlation coefficient, $r$, between $x$ and $y$.
\item State, giving a reason, whether or not these data support the researcher's belief.
The researcher decides to add a child with family name "Turner" to the sample.
\item Using the definition $\mathrm { S } _ { x x } = \sum ( x - \bar { x } ) ^ { 2 }$, state the new value of $\mathrm { S } _ { x x }$ giving a reason for your answer.
Given that the addition of the child with family name "Turner" to the sample leads to an increase in $\mathrm { S } _ { y y }$,
\item use the definition $\mathrm { S } _ { x y } = \sum ( x - \bar { x } ) ( y - \bar { y } )$ to determine whether or not the value of $r$ will increase, decrease or stay the same. Give a reason for your answer.
\end{enumerate}
\hfill \mbox{\textit{Edexcel S1 2013 Q5 [11]}}
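
The reasoning for parts (d) and (e) can be sketched as a short worked derivation. This is an illustrative outline, not the official solution; the primed symbols $\mathrm{S}_{xx}'$, $\mathrm{S}_{xy}'$, $\mathrm{S}_{yy}'$ and $r'$ are notation introduced here for the values after the new child is added, and $y_{11}$ denotes that child's unknown first-name length. "Turner" has 6 letters, so $x_{11} = 6 = \bar{x}$ and the mean of $x$ is unchanged, since $(60 + 6)/11 = 6$. The extra term in $\mathrm{S}_{xx}$ is therefore zero:

$$\mathrm{S}_{xx}' = \sum_{i=1}^{10} (x_i - 6)^2 + (6 - 6)^2 = 28$$

Because $\sum (x - \bar{x}) = 0$, the definition of $\mathrm{S}_{xy}$ reduces to $\sum (x - \bar{x}) y$, and the extra term there is also zero:

$$\mathrm{S}_{xy}' = \sum_{i=1}^{10} (x_i - 6) y_i + (6 - 6) y_{11} = 16$$

With the numerator and $\mathrm{S}_{xx}$ unchanged and $\mathrm{S}_{yy}' > 20.9$, the correlation coefficient must fall:

$$r' = \frac{16}{\sqrt{28 \, \mathrm{S}_{yy}'}} < \frac{16}{\sqrt{28 \times 20.9}} = r$$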