Edexcel S1 2014 June — Question 2 14 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2014
Session: June
Marks: 14
Topic: Data representation
Type: Estimate mean and standard deviation from frequency table
Difficulty: Moderate -0.8. This is a standard S1 statistics question covering routine grouped data calculations (mean, standard deviation, median via interpolation) and histogram construction. All required summations are provided, and the conceptual demands (understanding frequency density, linear interpolation, and the effect of adding a data point) are straightforward textbook material requiring no novel insight.
Spec: 2.02a Interpret single variable data: tables and diagrams; 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation

  1. The table below shows the distances (to the nearest km) travelled to work by the 50 employees in an office.

Distance (km) | Frequency (f) | Distance midpoint (x)
0-2 | 16 | 1.25
3-5 | 12 | 4
6-10 | 10 | 8
11-20 | 8 | 15.5
21-40 | 4 | 30.5

$$\left[\text{You may use } \sum \mathrm{f}x = 394, \quad \sum \mathrm{f}x^{2} = 6500\right]$$
A histogram has been drawn to represent these data.
The bar representing the distance of \(3 - 5\) has a width of 1.5 cm and a height of 6 cm.
  (a) Calculate the width and height of the bar representing the distance of 6-10.
  (b) Use linear interpolation to estimate the median distance travelled to work.
  (c) (i) Show that an estimate of the mean distance travelled to work is 7.88 km.
      (ii) Estimate the standard deviation of the distances travelled to work.
  (d) Describe, giving a reason, the skewness of these data.

Peng starts to work in this office as the \(51^{\text{st}}\) employee.
She travels a distance of 7.88 km to work.
  (e) Without carrying out any further calculations, state, giving a reason, what effect Peng's addition to the workforce would have on your estimates of the
      (i) mean,
      (ii) median,
      (iii) standard deviation
      of the distances travelled to work.

Answer | Marks | Guidance
**(a)** Width $= \frac{5}{3} \times 1.5 = 2.5$ (cm) | B1 (3) |
Area $= 6 \times 1.5 = 9$ cm² has frequency $= 12$ so $1.5$ cm² $= 2$ people (o.e.). Frequency of 10 corresponds to area of 7.5 so height $= 3$ (cm) | M1, A1 | For forming a relationship between area and no. of people or "their width" × "their height"= 7.5 or for $\frac{3h}{10} = \frac{9}{12}$ oe. A1 for height of 3 (cm). NOTE: the common incorrect answer width = 3 and height = 2.5 scores B0M1A0
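
The scaling argument in (a) can be checked numerically. The sketch below assumes the class boundaries implied by "to the nearest km" (3-5 covers 2.5 to 5.5, a class width of 3; 6-10 covers 5.5 to 10.5, a class width of 5); the function name `bar_dimensions` is illustrative and not part of the mark scheme.

```python
def bar_dimensions(ref_class_width, ref_freq, ref_bar_width_cm, ref_bar_height_cm,
                   class_width, freq):
    """Scale a histogram bar from a reference bar, using area proportional to frequency."""
    cm_per_km = ref_bar_width_cm / ref_class_width                      # horizontal scale
    area_per_person = ref_bar_width_cm * ref_bar_height_cm / ref_freq   # cm^2 per employee
    width_cm = class_width * cm_per_km
    height_cm = freq * area_per_person / width_cm
    return width_cm, height_cm

# Reference bar: class 3-5 (width 3 km, frequency 12) drawn 1.5 cm wide and 6 cm high.
# Target bar: class 6-10 (width 5 km, frequency 10).
width, height = bar_dimensions(3, 12, 1.5, 6, 5, 10)
print(width, height)
```

Using the unrounded class limits (a width of 4 for 6-10, or 3 cm on the page) is what produces the common incorrect answer noted in the guidance.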

**(b)** $Q_2 = [2.5 +] \frac{(25 \text{ or } 25.5) - 16}{12} \times 3 = 4.75$ (or 4.875 if using $n+1$), awrt $4.75$ | M1 A1 (2) |
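
The interpolation in (b) can be sketched as a short function. This is an illustration under stated assumptions: the class boundaries are taken as 0, 2.5, 5.5, 10.5, 20.5 and 40.5 (the lower boundary of 0 matches the midpoint 1.25 given for the first class), and the name `interpolate_quantile` is hypothetical.

```python
def interpolate_quantile(boundaries, freqs, position):
    """Linear interpolation inside the class that contains the given cumulative position."""
    cumulative = 0
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= position:
            # Fraction of the way through this class, scaled by the class width
            return lower + (position - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("position exceeds total frequency")

# Assumed class boundaries for distances recorded to the nearest km
boundaries = [(0, 2.5), (2.5, 5.5), (5.5, 10.5), (10.5, 20.5), (20.5, 40.5)]
freqs = [16, 12, 10, 8, 4]
n = sum(freqs)

median = interpolate_quantile(boundaries, freqs, n / 2)            # n/2 convention
median_alt = interpolate_quantile(boundaries, freqs, (n + 1) / 2)  # (n+1)/2 convention
print(median, median_alt)
```

Both conventions place the median in the 3-5 class, which is why the mark scheme accepts either 4.75 or 4.875.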

**(c)(i)** $[\bar{x}=] \frac{394}{50} = 7.88$ (*) | B1cso (4) |
**(c)(ii)** $[\sigma_s =] \sqrt{\frac{6500}{50} - \bar{x}^2} = \sqrt{67.9056}$ $=$ awrt $8.24$ (Accept $s =$ awrt 8.32) | M1A1, A1 | For correct expression which must have 6500, 50 and 7.88. (square root not necessary for M1). 2nd A1 for awrt 8.24 (use of $s =$ awrt 8.32). Condone incorrect labelling if awrt 8.24 is found.
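
As a check on (c), the summations quoted in the question can be rebuilt from the table and fed into the two formulas. This is a minimal sketch; the variable names are illustrative only.

```python
import math

freqs = [16, 12, 10, 8, 4]
midpoints = [1.25, 4, 8, 15.5, 30.5]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # quoted as 394
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))  # quoted as 6500

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2         # divisor n, as in the mark scheme
sigma = math.sqrt(variance)                # awrt 8.24
s = math.sqrt(variance * n / (n - 1))      # divisor n - 1, the accepted alternative (awrt 8.32)

print(round(mean, 2), round(variance, 4), round(sigma, 2), round(s, 2))
```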

**(d)** $\bar{x} > Q_2$ | B1ft, dB1 (2) | 1st B1ft for correct comparison of $\bar{x} = 7.88$ and their $Q_2$ (this may be seen embedded in another formula i.e. 3(mean-median)/s.d.). $Q_3 - Q_2 > Q_2 - Q_1$ is B0 unless $Q_1$ and $Q_3$ have been found. ($Q_1 = 1.95/1.99, Q_3 = 10.25/10.81$). So positive (skew). 2nd dB1 Dependent on the 1st B1 and for concluding "positive" skew. Note: if their $Q_2 > 7.88$, then B0. Positive correlation is B0.
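
The guidance for (d) mentions the coefficient 3(mean - median)/s.d. as one way the comparison may appear. A small sketch, assuming the values obtained in (b) and (c) (mean 7.88, median 4.75, standard deviation 8.24):

```python
mean, median, sd = 7.88, 4.75, 8.24  # estimates from parts (b) and (c)

# The sign of 3(mean - median)/s.d. gives the direction of skew
coefficient = 3 * (mean - median) / sd
if coefficient > 0:
    skew = "positive"
elif coefficient < 0:
    skew = "negative"
else:
    skew = "none"

print(round(coefficient, 2), skew)
```

The sign depends only on whether the mean exceeds the median, which is why the comparison $\bar{x} > Q_2$ alone is enough for the first mark.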

**(e)(i)** There is no effect on the mean | B1 (3) | [14]
**(e)(ii)** The median will increase | B1 |
**(e)(iii)** The standard deviation will decrease | B1 |
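
Part (e) asks for reasoning without further calculation, but the three conclusions can be confirmed numerically afterwards. The sketch below assumes Peng's distance is treated as the exact value 7.88 (not replaced by the 6-10 class midpoint) and that the n/2 convention is used for the median; the helper names are hypothetical.

```python
import math

n, sum_fx, sum_fx2 = 50, 394, 6500
peng = 7.88  # assumption: used as an exact distance, equal to the estimated mean

def mean_sd(n, sum_fx, sum_fx2):
    """Mean and standard deviation (divisor n) from summary totals."""
    mean = sum_fx / n
    return mean, math.sqrt(sum_fx2 / n - mean ** 2)

def median_in_3_to_5(n):
    # 16 employees lie below 2.5 km; the 3-5 class spans 2.5 to 5.5 and holds 12.
    # Peng falls in the 6-10 class, so these counts are unchanged.
    return 2.5 + (n / 2 - 16) / 12 * 3

old_mean, old_sd = mean_sd(n, sum_fx, sum_fx2)
new_mean, new_sd = mean_sd(n + 1, sum_fx + peng, sum_fx2 + peng ** 2)
old_median, new_median = median_in_3_to_5(n), median_in_3_to_5(n + 1)

print(f"mean:   {old_mean:.3f} -> {new_mean:.3f}")
print(f"median: {old_median:.3f} -> {new_median:.3f}")
print(f"sd:     {old_sd:.3f} -> {new_sd:.3f}")
```

The mean is unchanged because the new value equals it, the median rises because the new value lies above it, and the standard deviation falls because a value at the mean adds nothing to the sum of squared deviations while the divisor grows.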

---
\begin{enumerate}
  \item The table below shows the distances (to the nearest km) travelled to work by the 50 employees in an office.
\end{enumerate}

\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Distance (km) & Frequency (f) & Distance midpoint (x) \\
\hline
0-2 & 16 & 1.25 \\
\hline
3-5 & 12 & 4 \\
\hline
6-10 & 10 & 8 \\
\hline
11-20 & 8 & 15.5 \\
\hline
21-40 & 4 & 30.5 \\
\hline
\end{tabular}
\end{center}

$$\left[\text{You may use } \sum \mathrm{f}x = 394, \quad \sum \mathrm{f}x^{2} = 6500\right]$$

A histogram has been drawn to represent these data.\\
The bar representing the distance of $3 - 5$ has a width of 1.5 cm and a height of 6 cm.\\
(a) Calculate the width and height of the bar representing the distance of 6-10.\\
(b) Use linear interpolation to estimate the median distance travelled to work.\\
(c) (i) Show that an estimate of the mean distance travelled to work is 7.88 km.\\
(ii) Estimate the standard deviation of the distances travelled to work.\\
(d) Describe, giving a reason, the skewness of these data.

Peng starts to work in this office as the $51^{\text{st}}$ employee.\\
She travels a distance of 7.88 km to work.\\
(e) Without carrying out any further calculations, state, giving a reason, what effect Peng's addition to the workforce would have on your estimates of the\\
(i) mean,\\
(ii) median,\\
(iii) standard deviation\\
of the distances travelled to work.

\hfill \mbox{\textit{Edexcel S1 2014 Q2 [14]}}