Edexcel S1: Question 4 (11 marks)

Exam Board: Edexcel
Module: S1 (Statistics 1)
Marks: 11
Topic: Data representation
Type: Use linear interpolation for median or quartiles
Difficulty: Standard +0.3. A standard S1 linear interpolation question that requires cumulative frequencies to estimate the median and percentiles from grouped data. Part (c) adds a straightforward interpretation about skewness. Slightly easier than average because it is a routine textbook exercise with a clear method, though the unequal class widths require some care.
Spec: 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation


Answer | Marks | Guidance
**(a)** Cumulative frequencies: 36, 128, 202, 241, 255, 282, 300 and median $= 150^{\text{th}} = 40 + 20(\frac{22}{74}) = 45.9$ [150.5th → 46.1] | M1 M1 A1 |
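
The interpolation behind the median can be written out in general form. This is only a sketch: the symbols $L$ (lower class boundary), $w$ (class width), $F$ (cumulative frequency below the class), $f$ (class frequency) and $k$ (position sought) are notation introduced here, not taken from the mark scheme. The 150th value falls in the 40 - class because the cumulative frequency passes 150 between 128 and 202.

```latex
\[
  x_k \approx L + w \cdot \frac{k - F}{f},
  \qquad
  \text{median} \approx 40 + 20 \times \frac{150 - 128}{74}
  = 40 + 20 \times \frac{22}{74} \approx 45.9
\]
```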

**(b)** Middle 80% is $P_{10}$ to $P_{90}$; $P_{10} = 30^{\text{th}} = 20(\frac{30}{36}) = 16.7$ [30.1th → 16.7] and $P_{90} = 270^{\text{th}} = 200 + 100(\frac{15}{27}) = 255.6$ [270.9th → 258.9]; therefore limits are 17 and 256 years | M1 M1 A2 |
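
As a check on the percentile arithmetic, the interpolation can be scripted. The sketch below is illustrative only: the function name `interpolate_position` and the list layout are assumptions made here, the class labels "0 -", "20 -", and so on are read as running up to the next lower boundary, and positions use the $np$ convention (30th and 270th values) rather than the bracketed $(n+1)p$ alternative.

```python
from itertools import accumulate

# Class boundaries and frequencies from the question's table.
# Assumption: "0 -", "20 -", ... mean [0, 20), [20, 40), ... up to 300-500.
boundaries = [0, 20, 40, 60, 100, 200, 300, 500]
frequencies = [36, 92, 74, 39, 14, 27, 18]


def interpolate_position(position, boundaries, frequencies):
    """Estimate the value at a cumulative position by linear interpolation."""
    cumulative = list(accumulate(frequencies))
    below = 0  # cumulative frequency below the current class
    for i, (freq, cum) in enumerate(zip(frequencies, cumulative)):
        if position <= cum:
            lower = boundaries[i]
            width = boundaries[i + 1] - boundaries[i]
            return lower + width * (position - below) / freq
        below = cum
    raise ValueError("position exceeds the total frequency")


n = sum(frequencies)  # 300 houses
p10 = interpolate_position(n * 10 / 100, boundaries, frequencies)
p90 = interpolate_position(n * 90 / 100, boundaries, frequencies)
print(round(p10, 1), round(p90, 1))  # 16.7 255.6
```

Passing 150 as the position reproduces the median estimate, and passing 270.9 reproduces the bracketed alternative for the upper limit.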

**(c)** e.g. the data are very skewed, with some extremely high values; these do not affect the median but increase the mean significantly; the median is better, as most values lie below the mean | B2 B1 | Total 11 marks
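
The contrast in part (c) can be illustrated numerically. The sketch below assumes the quoted mean of 86.6 was obtained from class midpoints (the usual grouped-data method, though the source does not state this) and then estimates, by the same kind of linear interpolation, what share of the houses are younger than the mean. All variable names are illustrative.

```python
# Class boundaries and frequencies from the question's table.
boundaries = [0, 20, 40, 60, 100, 200, 300, 500]
frequencies = [36, 92, 74, 39, 14, 27, 18]

n = sum(frequencies)

# Assumption: each class is represented by its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Estimate how many houses are younger than the mean by interpolating
# within the class that contains it.
houses_below = 0.0
for lo, hi, f in zip(boundaries, boundaries[1:], frequencies):
    if mean >= hi:
        houses_below += f  # the whole class lies below the mean
    elif mean > lo:
        houses_below += f * (mean - lo) / (hi - lo)  # part of the class

share_below = houses_below / n
print(round(mean, 1), round(share_below, 2))  # 86.6 0.76
```

Under these assumptions roughly three quarters of the houses lie below the mean, which is consistent with preferring the median as the representative value.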

---
4. The ages of 300 houses in a village are recorded giving the following table of results.

\begin{center}
\begin{tabular}{|l|l|}
\hline
Age (years) & Number of houses \\
\hline
0 - & 36 \\
\hline
20 - & 92 \\
\hline
40 - & 74 \\
\hline
60 - & 39 \\
\hline
100 - & 14 \\
\hline
200 - & 27 \\
\hline
300-500 & 18 \\
\hline
\end{tabular}
\end{center}

Use linear interpolation to estimate for these data
\begin{enumerate}[label=(\alph*)]
\item the median,
\item the limits between which the middle $80 \%$ of the ages lie.

An estimate of the mean of these data is calculated to be 86.6 years.
\item Explain why the mean and median are so different and hence say which you consider best represents the data.
\end{enumerate}

\hfill \mbox{\textit{Edexcel S1  Q4 [11]}}