Edexcel S1 2009 January, Question 5 (16 marks)

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2009
Session: January
Marks: 16
Topic: Data representation
Type: Calculate using histogram bar dimensions
Difficulty: Standard +0.3. This is a standard S1 histogram question testing routine procedures: using frequency density to find bar dimensions, linear interpolation for the median and IQR, and the mean and standard deviation from grouped data. All techniques are textbook exercises that require careful arithmetic but no problem-solving insight. It is slightly easier than average because it is entirely procedural.
Spec: 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.02h Recognize outliers

5. In a shopping survey a random sample of 104 teenagers were asked how many hours, to the nearest hour, they spent shopping in the last month. The results are summarised in the table below.

| Number of hours | Mid-point | Frequency |
|---|---|---|
| 0-5 | 2.75 | 20 |
| 6-7 | 6.5 | 16 |
| 8-10 | 9 | 18 |
| 11-15 | 13 | 25 |
| 16-25 | 20.5 | 15 |
| 26-50 | 38 | 10 |

A histogram was drawn and the group (8-10) hours was represented by a rectangle that was 1.5 cm wide and 3 cm high.

  (a) Calculate the width and height of the rectangle representing the group (16-25) hours.
  (b) Use linear interpolation to estimate the median and interquartile range.
  (c) Estimate the mean and standard deviation of the number of hours spent shopping.
  (d) State, giving a reason, the skewness of these data.
  (e) State, giving a reason, which average and measure of dispersion you would recommend to use to summarise these data.

## Question 5:

**(a)**
8–10 hours: width $= 10.5 - 7.5 = 3$, represented by 1.5 cm

16–25 hours: width $= 25.5 - 15.5 = 10$, represented by 5 cm | B1, M1 | M1 for attempting both frequency densities $\frac{18}{3}(=6)$ and $\frac{15}{10}$, and $\frac{15}{10} \times \text{SF}$ where $\text{SF} \neq 1$

8–10 hours: height $= \text{fd} = 18/3 = 6$, represented by 3 cm

16–25 hours: height $= \text{fd} = 15/10 = 1.5$, represented by 0.75 cm | A1 (3) |
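
The scaling argument in part (a) can be checked with a short Python sketch. It assumes class boundaries at the half-hour (7.5 to 10.5 and 15.5 to 25.5, as in the working above); the helper name `bar_dimensions` and the two scale-factor variables are illustrative only and do not come from the mark scheme.

```python
def bar_dimensions(lower, upper, freq, cm_per_hour, cm_per_fd):
    """Return (width_cm, height_cm) of a histogram bar.

    lower/upper are class boundaries in hours, freq is the class frequency,
    and the two scale factors convert hours and frequency density to cm.
    """
    class_width = upper - lower
    freq_density = freq / class_width
    return class_width * cm_per_hour, freq_density * cm_per_fd


# Reference bar: 8-10 hours, frequency 18, drawn 1.5 cm wide and 3 cm high.
ref_width = 10.5 - 7.5           # 3 hours
ref_fd = 18 / ref_width          # frequency density 6
cm_per_hour = 1.5 / ref_width    # 0.5 cm per hour
cm_per_fd = 3 / ref_fd           # 0.5 cm per unit of frequency density

width_cm, height_cm = bar_dimensions(15.5, 25.5, 15, cm_per_hour, cm_per_fd)
print(width_cm, height_cm)  # 5.0 0.75
```

Both scale factors happen to equal 0.5 here, but they are computed separately because the horizontal and vertical scales of a histogram are independent.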

**(b)**
$Q_2 = 7.5 + \frac{(52-36)}{18} \times 3 = 10.2$ | M1, A1 | M1 for identifying correct interval and correct fraction. 1st A1 for 10.2 (using $n+1$ allow AWRT 10.3)

$Q_1 = 5.5 + \frac{(26-20)}{16} \times 2 = 6.25$ or $6.3$, or $5.5 + \frac{(26.25-20)}{16} \times 2 [=6.3]$ | A1 | 2nd A1 for correct expression for either $Q_1$ or $Q_3$

$Q_3 = 10.5 + \frac{(78-54)}{25} \times 5 [=15.3]$ or $10.5 + \frac{(78.75-54)}{25} \times 5 [=15.45 \approx 15.5]$ | A1, A1ft | 3rd A1 for correct expressions for both $Q_1$ and $Q_3$; 4th A1ft for IQR, ft their quartiles

$\text{IQR} = (15.3 - 6.3) = 9$ | (5) |
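
The interpolation in part (b) can be reproduced with the following Python sketch. It uses the $n/4$, $n/2$, $3n/4$ positions (26, 52, 78) from the first version of each expression above, and it assumes the first class runs from 0 to 5.5, which is what the tabulated mid-point of 2.75 implies. The function name `interpolate` and the `classes` list are hypothetical names chosen for illustration.

```python
def interpolate(position, classes):
    """Linear interpolation within grouped data.

    classes is a list of (lower_boundary, upper_boundary, frequency) tuples
    in ascending order; position is the cumulative-frequency position sought.
    """
    cumulative = 0
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            fraction = (position - cumulative) / freq
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds total frequency")


# Class boundaries for data recorded to the nearest hour.
# The lower boundary 0 for the first class is an assumption (mid-point 2.75).
classes = [
    (0, 5.5, 20),
    (5.5, 7.5, 16),
    (7.5, 10.5, 18),
    (10.5, 15.5, 25),
    (15.5, 25.5, 15),
    (25.5, 50.5, 10),
]
n = sum(freq for _, _, freq in classes)  # 104

q1 = interpolate(n / 4, classes)
q2 = interpolate(n / 2, classes)
q3 = interpolate(3 * n / 4, classes)
iqr = q3 - q1
print(round(q1, 2), round(q2, 2), round(q3, 2), round(iqr, 2))
```

With unrounded quartiles the IQR comes out as 9.05; the mark scheme's value of 9 uses $Q_1$ rounded to 6.3.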

**(c)**
$\sum fx = 1333.5 \Rightarrow \bar{x} = \frac{1333.5}{104}$ AWRT 12.8 | M1 A1 | 1st M1 for attempting $\sum fx$ and $\bar{x}$

$\sum fx^2 = 27254 \Rightarrow \sigma_x = \sqrt{\frac{27254}{104} - \bar{x}^2} = \sqrt{262.05 - \bar{x}^2}$ AWRT 9.88 | M1 A1 (4) | 2nd M1 for attempting $\sum fx^2$ and $\sigma_x$; $\sqrt{\phantom{x}}$ is needed for M1. Allow $s =$ AWRT 9.93
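
As a check on the arithmetic in part (c), here is a minimal Python sketch using the mid-points and frequencies from the question's table. The variable names are illustrative; `sigma` divides by $n$ and `s` uses the $n-1$ divisor that the guidance also allows.

```python
from math import sqrt

midpoints = [2.75, 6.5, 9, 13, 20.5, 38]
freqs = [20, 16, 18, 25, 15, 10]

n = sum(freqs)                                              # 104
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))       # 1333.5
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))  # 27254.0

mean = sum_fx / n
sigma = sqrt(sum_fx2 / n - mean ** 2)            # divisor n
s = sqrt((sum_fx2 - n * mean ** 2) / (n - 1))    # divisor n - 1

print(round(mean, 1), round(sigma, 2), round(s, 2))  # 12.8 9.88 9.93
```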

**(d)**
$Q_3 - Q_2 [=5.1] > Q_2 - Q_1 [=3.9]$ or $Q_2 < \bar{x}$ | B1ft | 1st B1ft for suitable test; values need not be seen but statement must be compatible with values used

So the data are positively skewed | dB1 (2) | 2nd dB1 dependent on test showing positive skew and for stating positive skew. If test shows negative skew score 1st B1 but lose 2nd
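
Either test in part (d) amounts to a single comparison. The sketch below applies both, using the rounded quartiles from part (b) and the mean from part (c); the variable names are illustrative and not part of the mark scheme.

```python
# Rounded values taken from parts (b) and (c).
q1, q2, q3 = 6.3, 10.2, 15.3
mean = 12.8

upper_gap = q3 - q2   # about 5.1
lower_gap = q2 - q1   # about 3.9

# Quartile test: a longer upper gap suggests a tail to the right.
quartile_test_positive = upper_gap > lower_gap
# Mean versus median test: a mean above the median also suggests positive skew.
mean_test_positive = q2 < mean

print(quartile_test_positive, mean_test_positive)  # True True
```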

**(e)**
Use median and IQR, since data is skewed or not affected by extreme values or outliers | B1, B1 (2) | 1st B1 for choosing median and IQR — must mention both; 2nd B1 for suitable reason. "Use median because data is skewed" scores B0B1 since IQR not mentioned