| Exam Board | OCR MEI |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2008 |
| Session | June |
| Marks | 20 |
| Topic | Data representation |
| Type | Estimate mean and standard deviation from histogram |
| Difficulty | Moderate (-0.8). This is a standard S1 histogram question requiring routine skills: reading frequency density, calculating cumulative frequencies, finding the median from grouped data, and understanding how changing class boundaries affects summary statistics. All techniques are textbook exercises with no novel problem-solving required, making it easier than average A-level maths. |
| Spec | 2.02a Interpret single variable data: tables and diagrams; 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02i Select/critique data presentation |
## (i), (ii)
**Positive** | B1 | | 1 mark
Number of people $= 20 \times 33\text{ (000)} + 5 \times 58\text{ (000)} = 660\text{ (000)} + 290\text{ (000)} = 950\text{ 000}$ | M1, M1(indep), A1cao | M1 first term; M1(indep) second term; A1cao; NB answer of 950 scores M2A0
| | 3 marks total
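The area calculation in part (ii) can be sketched in Python. This is only an illustration, assuming the frequency densities quoted in the mark scheme (33 thousand per year for ages 0 to 20 and 58 thousand per year for ages 20 to 30); the helper name `people_in_range` is hypothetical.

```python
def people_in_range(bars, lower, upper):
    """Sum histogram bar areas (width x frequency density) between two ages."""
    total = 0.0
    for start, end, density in bars:
        # Width of the part of this bar that lies inside [lower, upper]
        overlap = max(0.0, min(end, upper) - max(start, lower))
        total += overlap * density
    return total

# (class start, class end, frequency density in thousands per year),
# densities taken from the mark scheme working above
inner_bars = [(0, 20, 33), (20, 30, 58)]

under_25_thousands = people_in_range(inner_bars, 0, 25)
print(under_25_thousands * 1000)  # number of people, not thousands
```

Multiplying by 1000 at the end matters: the mark scheme notes that an answer of 950 loses the accuracy mark.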
## (iii)
**(A)** $a = 1810 + 340 = 2150$ | M1, A1cao | M1 for sum; A1cao 2150 or 2150 thousand but not 215000 | 2 marks
**(B)** Median = age of the 1385 (000)th person, or the 1385.5 (000)th
Age 30, cf = 1 240 (000); age 40, cf = 1 810 (000)
Estimate median $= (30) + \frac{145}{570} \times 10$ | M1 | M1 for attempt to interpolate $\frac{145k}{570k} \times 10$ (2.54 or better suggests this); A1cao min 1dp
Median = 32.5 years (32.54...) If no working shown then 32.54 or better is needed to gain the M1A1. If 32.5 seen with no previous working allow SC1 | A1cao | 3 marks
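The interpolation in part (iii)(B) can be checked with a short Python sketch. It assumes the cumulative frequency table from the question with $a = 2150$ from part (A), and it adds a starting point of age 0 with cumulative frequency 0; the function name `grouped_median` is hypothetical.

```python
def grouped_median(boundaries, cum_freq):
    """Estimate the median by linear interpolation in a cumulative table.

    boundaries[i] is the upper class boundary whose cumulative frequency is
    cum_freq[i]; a lowest point (age 0, cumulative frequency 0) is prepended.
    """
    bounds = [0] + list(boundaries)
    cfs = [0] + list(cum_freq)
    target = cfs[-1] / 2  # position of the median
    for i in range(1, len(bounds)):
        if cfs[i] >= target:
            width = bounds[i] - bounds[i - 1]
            fraction = (target - cfs[i - 1]) / (cfs[i] - cfs[i - 1])
            return bounds[i - 1] + fraction * width
    raise ValueError("empty distribution")

ages = [20, 30, 40, 50, 65, 100]
cf = [660, 1240, 1810, 2150, 2490, 2770]  # thousands, with a = 2150
median = grouped_median(ages, cf)
print(round(median, 2))
```

Because the frequencies appear in both the numerator and the denominator of the interpolation fraction, working in thousands gives the same median as working in individual people.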
## (iv)
Frequency densities: 56, 65, 77, 59, 45, 17
(accept 45.33 and 17.43 for 45 and 17) | B1 | B1 for any one correct; B1 for all correct (soi by listing or from histogram)
| | **Note:** all G marks below dep on attempt at frequency density, NOT frequency
**G1** Linear scales on both axes (no inequalities); **G1** Heights FT their listed frequency densities (all must be correct). **Also widths. All blocks joined** | | 5 marks total
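The frequency densities for part (iv) follow from dividing each frequency by its class width. A minimal Python sketch, using the Outer London table from the question (variable names are illustrative):

```python
# Class boundaries (years) and frequencies (thousands) for Outer London
classes = [(0, 20), (20, 30), (30, 40), (40, 50), (50, 65), (65, 100)]
freqs = [1120, 650, 770, 590, 680, 610]

# Frequency density = frequency / class width (thousands per year of age)
densities = [f / (hi - lo) for (lo, hi), f in zip(classes, freqs)]

for (lo, hi), d in zip(classes, densities):
    print(f"{lo}-{hi}: {d:.2f}")
```

These are the bar heights to plot; the bar widths are the class widths, so the last two bars are wider than the others.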
## (v)
Any two suitable comments such as:
- Outer London has a greater proportion (or %) of people under 20 (or almost equal proportion)
- The modal group in Inner London is 20-30 but in Outer London it is 30-40
- Outer London has a greater proportion (14%) of people aged 65+
- **The population in every age group is higher in Outer London**
- Outer London has a more evenly spread distribution or balanced distribution (ages) o.e. | E1, E1 | | 2 marks
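The proportions behind these comments can be worked out as follows. This sketch assumes the Inner London class frequencies are obtained by differencing the cumulative frequency table (with $a = 2150$ from part (iii)(A)); the Outer London frequencies come straight from the question.

```python
# Inner London: difference the cumulative frequencies (thousands)
inner_cf = [660, 1240, 1810, 2150, 2490, 2770]
inner = [cf - prev for cf, prev in zip(inner_cf, [0] + inner_cf[:-1])]

# Outer London: frequencies (thousands) as given in the question
outer = [1120, 650, 770, 590, 680, 610]
labels = ["0-20", "20-30", "30-40", "40-50", "50-65", "65-100"]

def percentages(freqs):
    """Convert class frequencies to percentages of the total population."""
    total = sum(freqs)
    return [100 * f / total for f in freqs]

inner_pct = percentages(inner)
outer_pct = percentages(outer)
for label, i, o in zip(labels, inner_pct, outer_pct):
    print(f"{label:>6}: Inner {i:5.1f}%  Outer {o:5.1f}%")
```

Comparing percentages rather than raw frequencies is what makes the comments meaningful, since Outer London has the larger total population.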
## (vi)
- Mean: increases ↑
- Median: unchanged (-)
- Midrange: increases ↑
- Standard deviation: increases ↑
- Interquartile range: unchanged (-)
| B1, B1, B1, B1 | Any one correct B1; Any two correct B2; Any three correct B3; **All five correct B4** | 4 marks
| | **TOTAL 20 marks**
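The answers to part (vi) can be confirmed by recalculating the five estimates with the final class ending at 105 instead of 100. The sketch below is one way to do this, assuming class midpoints for the mean and standard deviation, linear interpolation for the median and quartiles, and a divisor of $n$ for the standard deviation; the function names are hypothetical. Under these assumptions the values for a maximum age of 100 come out close to those quoted in the question.

```python
from math import sqrt

FREQS = [1120, 650, 770, 590, 680, 610]  # thousands, from the question

def classes(max_age):
    """Class boundaries with an adjustable upper limit for the final group."""
    return [(0, 20), (20, 30), (30, 40), (40, 50), (50, 65), (65, max_age)]

def quantile(cls, freqs, p):
    """Linear interpolation for the p-th quantile of grouped data."""
    target = p * sum(freqs)
    cum = 0
    for (lo, hi), f in zip(cls, freqs):
        if cum + f >= target:
            return lo + (target - cum) / f * (hi - lo)
        cum += f
    raise ValueError("p must lie in [0, 1]")

def summary(max_age, freqs=FREQS):
    cls = classes(max_age)
    n = sum(freqs)
    mids = [(lo + hi) / 2 for lo, hi in cls]
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    # Divisor n is an assumption; n - 1 gives the same value to 1 d.p. here
    sd = sqrt(sum(f * m * m for f, m in zip(freqs, mids)) / n - mean ** 2)
    return {
        "mean": mean,
        "median": quantile(cls, freqs, 0.5),
        "midrange": (cls[0][0] + cls[-1][1]) / 2,
        "standard deviation": sd,
        "interquartile range": quantile(cls, freqs, 0.75) - quantile(cls, freqs, 0.25),
    }

def direction(old_value, new_value):
    if new_value > old_value:
        return "increase"
    if new_value < old_value:
        return "decrease"
    return "unchanged"

old, new = summary(100), summary(105)
changes = {name: direction(old[name], new[name]) for name in old}
for name, change in changes.items():
    print(f"{name}: {change}")
```

The median and both quartiles fall in classes below 65, so they never involve the final class boundary; only the statistics that depend on the last midpoint or the maximum value move.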
7 The histogram shows the age distribution of people living in Inner London in 2001.\\
\includegraphics[max width=\textwidth, alt={}, center]{be764df3-ff20-415d-9c5c-10edabf350de-5_814_1383_349_379}
Data sourced from the 2001 Census, \href{http://www.statistics.gov.uk}{www.statistics.gov.uk}
\begin{enumerate}[label=(\roman*)]
\item State the type of skewness shown by the distribution.
\item Use the histogram to estimate the number of people aged under 25.
\item The table below shows the cumulative frequency distribution.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age & 20 & 30 & 40 & 50 & 65 & 100 \\
\hline
Cumulative frequency (thousands) & 660 & 1240 & 1810 & $a$ & 2490 & 2770 \\
\hline
\end{tabular}
\end{center}
(A) Use the histogram to find the value of $a$.\\
(B) Use the table to calculate an estimate of the median age of these people.

The ages of people living in Outer London in 2001 are summarised below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age ( $x$ years) & $0 \leqslant x < 20$ & $20 \leqslant x < 30$ & $30 \leqslant x < 40$ & $40 \leqslant x < 50$ & $50 \leqslant x < 65$ & $65 \leqslant x < 100$ \\
\hline
Frequency (thousands) & 1120 & 650 & 770 & 590 & 680 & 610 \\
\hline
\end{tabular}
\end{center}
\item Illustrate these data by means of a histogram.
\item Make two brief comments on the differences between the age distributions of the populations of Inner London and Outer London.
\item The data given in the table for Outer London are used to calculate the following estimates.
Mean 38.5, median 35.7, midrange 50, standard deviation 23.7, interquartile range 34.4.\\
The final group in the table assumes that the maximum age of any resident is 100 years. These estimates are to be recalculated, based on a maximum age of 105, rather than 100. For each of the five estimates, state whether it would increase, decrease or be unchanged.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S1 2008 Q7 [20]}}