| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2019 |
| Session | January |
| Marks | 18 |
| Topic | Bivariate data |
| Type | Calculate r from summary statistics |
| Difficulty | Moderate (-0.3). A standard S1 correlation/regression question requiring routine calculations from given summary statistics. Although it has several parts (finding means, Sxy, the correlation coefficient r, and the regression line), every step is a direct application of a formula, with no conceptual challenge or novel problem-solving. Slightly easier than average because it is highly procedural. |
| Spec | 2.02c Scatter diagrams and regression lines; 2.02d Informal interpretation of correlation; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 5.08a Pearson correlation: calculate pmcc; 5.09a Dependent/independent variables; 5.09c Calculate regression line |
# Question 6:
## Part (a)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Mean, median, average, marks, results score: on P2 ($y$) is lower than P1($x$) | B1 | One of these 5 terms seen for 1st B1 |
| Spread, dispersion, range, st. dev, var(iance): on P2 is more than P1 | B1 | **(2)** |
## Part (b)(i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| e.g. (38, 0) doesn't follow the pattern/trend or out of range of other points, or far from (best fit) line / other points | B1 | Suitable explanation; saying "extreme point" is B0 |
## Part (b)(ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| The student was absent when paper 2 was taken | B1 | e.g. teacher didn't mark it, wrongly recorded/plotted **(2)** |
## Part (c)
| Answer | Marks | Guidance |
|--------|-------|----------|
| New $\bar{x} = \frac{35.75 \times 16 - 38}{15} = \frac{534}{15} = \textbf{35.6}$ | M1, A1 | M1 for correct method; list requires $\Sigma x = 534$ and $\div 15$; A1 for 35.6 or e.g. $35\frac{3}{5}$ |
| New $\bar{y} = \frac{25.75 \times 16}{15} = 27.4\dot{6}$ awrt **27.5** (allow $\frac{412}{15}$) | B1 | B1 for awrt 27.5 **(3)** |
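
The method in part (c) can be checked numerically. The following is a minimal sketch, assuming only the summary statistics given in the question; the variable names are illustrative and not part of the mark scheme.

```python
# Summary statistics for all 16 students (from the question)
n = 16
mean_x, mean_y = 35.75, 25.75
outlier_x, outlier_y = 38, 0  # the omitted point (38, 0)

# Recover each total from its mean, subtract the omitted point,
# then divide by the new number of students (15)
sum_x = mean_x * n - outlier_x
sum_y = mean_y * n - outlier_y
new_mean_x = sum_x / (n - 1)
new_mean_y = sum_y / (n - 1)

print(sum_x, sum_y)                                 # the two new totals
print(round(new_mean_x, 1), round(new_mean_y, 1))   # the two new means
```

The totals come out as 534 and 412, matching the values quoted in the guidance column.
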
## Part (d)(i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| New $\sum xy = 15837 - 38 \times 0$ so no change | B1 | For explanation with sight of "$38 \times 0$"; e.g. for (38, 0) or omitted point, $xy = 0$ |
## Part (d)(ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $S_{xy} = 15837 - \frac{(35.75 \times 16 - 38)(25.75 \times 16)}{15}$ or $-\frac{\text{"534"} \times \text{"412"}}{15}$ or $-\frac{220008}{15}$ | M1 | For a correct expression (can ft their 534 and their 412 if stated in (c)) |
| $= \textbf{1169.8}$ (*) | A1cso | Dependent on M1 with no incorrect working; may be seen in (e) **(3)** |
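
As a check on part (d)(ii), the sketch below evaluates $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$ for the 15 remaining students. It assumes the totals 534 and 412 from part (c); the variable names are illustrative.

```python
# Values for the 15 remaining students
n = 15
sum_x = 534      # 35.75 * 16 - 38
sum_y = 412      # 25.75 * 16 - 0
sum_xy = 15837   # unchanged, because the omitted point contributes 38 * 0 = 0

# S_xy = sum(xy) - sum(x) * sum(y) / n
s_xy = sum_xy - sum_x * sum_y / n

print(round(s_xy, 1))
```
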
## Part (e)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $r = \frac{1169.8}{\sqrt{965.6 \times 1561.7}} = 0.9526079\ldots$ awrt **0.953** | M1, A1 | M1 for correct method (implied by ans = awrt 0.95); A1 for awrt 0.953 **(2)** |
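
The calculation in part (e) can be reproduced as follows. This is a minimal sketch using the rounded values of $S_{xx}$ and $S_{yy}$ given in the question, so the last decimal places depend on that rounding.

```python
import math

# Values for the 15 remaining students (from the question and part (d))
s_xy = 1169.8
s_xx = 965.6
s_yy = 1561.7

# Product moment correlation coefficient: r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)

print(round(r, 3))
```
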
## Part (f)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $b = \frac{1169.8}{965.6}\ [= 1.21147\ldots]$ | M1 | 1st M1 for correct expression for $b$ |
| $a = \text{"27.5"} - \text{"b"} \times \text{"35.6"}\ [= -15.6618\ldots]$ | M1 | 2nd M1 for correct expression for $a$ (ft means in (c)) |
| $y = -15.7 + 1.2x$; $b$ = awrt **1.2**, $a$ = awrt **$-15.6$ or $-15.7$** | A1, A1 | $a$ and $b$ must be in an $x, y$ equation; 1st A1 for $b$, 2nd A1 for $a$ **(4)** |
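
The regression coefficients in part (f) can be checked with the sketch below. It assumes the unrounded means $\frac{534}{15}$ and $\frac{412}{15}$ from part (c); using the rounded mean 27.5 instead shifts $a$ slightly, which is why the mark scheme accepts both $-15.6$ and $-15.7$.

```python
# Values for the 15 remaining students
s_xy = 1169.8
s_xx = 965.6
mean_x = 534 / 15   # 35.6
mean_y = 412 / 15   # 27.466...

# Regression of y on x: gradient b = S_xy / S_xx, intercept a = mean_y - b * mean_x
b = s_xy / s_xx
a = mean_y - b * mean_x

print(f"y = {a:.1f} + {b:.2f}x")
```
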
## Part (g)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Value of $r$ increased from 0.746 to 0.953 so points lie closer to a straight line | B1 | Suitable comment e.g. linear relationship **stronger** or stronger linear correlation **(1)** |
## Part (h)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $y = \text{"1.21..."} \times 38 - \text{"15.66..."}$ or awrt **30** | B1ft | ft for awrt 30 or ft expression using $x = 38$ in their equation (need not be evaluated) **(1)** |
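
The estimate in part (h) follows by substituting $x = 38$ into the regression line. A minimal sketch, recomputing $a$ and $b$ from the values above so the block stands alone (variable names are illustrative):

```python
# Regression line for the 15 remaining students
s_xy, s_xx = 1169.8, 965.6
mean_x, mean_y = 534 / 15, 412 / 15

b = s_xy / s_xx            # gradient
a = mean_y - b * mean_x    # intercept

# Estimated paper 2 mark for a student scoring 38 on paper 1
estimate = a + b * 38

print(round(estimate))
```

Since 38 lies within the range of paper 1 marks suggested by the question, this is an interpolation rather than an extrapolation.
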
**[18 marks total]**
\begin{enumerate}
\item Following some school examinations, Chetna is studying the results of the 16 students in her class. The mark for paper 1, $x$, and the mark for paper 2, $y$, for each student are summarised in the following statistics.
\end{enumerate}
$$\bar{x} = 35.75 \quad \bar{y} = 25.75 \quad \sigma_{x} = 7.79 \quad \sigma_{y} = 11.91 \quad \sum xy = 15837$$
(a) Comment on the differences between the marks of the students on paper 1 and paper 2
Chetna decides to examine these data in more detail and plots the marks for each of the 16 students on the scatter diagram opposite.\\
(b) (i) Explain why the circled point $(38, 0)$ is possibly an outlier.\\
(ii) Suggest a possible reason for this result.
Chetna decides to omit the data point $(38, 0)$ and examine the other 15 students' marks.\\
(c) Find the value of $\bar{x}$ and the value of $\bar{y}$ for these 15 students.
For these 15 students\\
(d) (i) explain why $\sum xy$ is still 15837\\
(ii) show that $\mathrm{S}_{xy} = 1169.8$
For these 15 students, Chetna calculates $\mathrm{S}_{xx} = 965.6$ and $\mathrm{S}_{yy} = 1561.7$ correct to 1 decimal place.\\
(e) Calculate the product moment correlation coefficient for these 15 students.\\
(f) Calculate the equation of the line of regression of $y$ on $x$ for these 15 students, giving your answer in the form $y = a + bx$
The product moment correlation coefficient between $x$ and $y$ for all 16 students is 0.746\\
(g) Explain how your calculation in part (e) supports Chetna's decision to omit the point $(38, 0)$ before calculating the equation of the linear regression line.\\
(h) Estimate the mark in the second paper for a student who scored 38 marks in the first paper.
\begin{center}
\includegraphics[max width=\textwidth, alt={}]{d3f4450d-60eb-49b6-be1b-d2fcfad0451f-17_1127_1146_301_406}
\end{center}
\begin{center}
\includegraphics[max width=\textwidth, alt={}]{d3f4450d-60eb-49b6-be1b-d2fcfad0451f-20_2630_1828_121_121}
\end{center}
\hfill \mbox{\textit{Edexcel S1 2019 Q6 [18]}}