| Exam Board | OCR MEI |
|---|---|
| Module | S1 (Statistics 1) |
| Marks | 20 |
| Topic | Data representation |
| Type | Describe shape or skewness of distribution |
| Difficulty | Moderate (-0.8). A multi-part question testing basic interpretation of histograms and summary statistics. Parts (i), (ii) and (iii) involve straightforward reading from a histogram and a cumulative frequency table. Part (iv) requires drawing a histogram from grouped data. Parts (v) and (vi) test conceptual understanding of how two distributions compare and how changing a class boundary affects summary statistics. All tasks are routine S1 content requiring recall and standard procedures rather than problem-solving or novel insight. |
| Spec | 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread |
# Question 3:
## Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Positive | B1 [1] | |
## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Number of people $= 20\times33\,000 + 5\times58\,000 = 660\,000 + 290\,000 = 950\,000$ | M1, M1, A1 [3] | M1 first term; M1 (indep) second term; A1 CAO; NB answer of 950 scores M2A0 |
## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $(A)\quad a = 1810 + 340 = 2150$ | M1, A1 [2] | M1 for sum; A1 CAO 2150 or 2150 thousand but not 215000 |
| $(B)$ Median = age of $1\,385\,000$th person; Age 30, cf $= 1\,240\,000$; Age 40, cf $= 1\,810\,000$; Estimate median $= 30 + \dfrac{145}{570}\times10 = 32.5$ years | B1, M1, A1 [3] | B1 for $1\,385\,000$ or 1385.5; M1 for attempt to interpolate $\dfrac{145k}{570k}\times10$; A1 CAO min 1dp |
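
The interpolation in (B) can be checked numerically. What follows is a minimal Python sketch rather than part of the mark scheme: the helper name `interpolate_quantile` is hypothetical, the value $a = 2150$ is taken from (A), and ages are assumed to be spread uniformly within each class.

```python
# Hypothetical helper (not from the mark scheme): linear interpolation
# on a cumulative frequency table whose first class starts at 0.
def interpolate_quantile(bounds, cum_freq, target):
    """Estimate the value at cumulative frequency `target`.

    bounds:   upper class boundaries in ascending order
    cum_freq: cumulative frequency at each upper boundary
    """
    lower_bound, lower_cf = 0, 0
    for upper_bound, upper_cf in zip(bounds, cum_freq):
        if target <= upper_cf:
            # Fraction of the way through the class that contains the target.
            fraction = (target - lower_cf) / (upper_cf - lower_cf)
            return lower_bound + fraction * (upper_bound - lower_bound)
        lower_bound, lower_cf = upper_bound, upper_cf
    raise ValueError("target exceeds the total frequency")


ages = [20, 30, 40, 50, 65, 100]
inner_cf = [660, 1240, 1810, 2150, 2490, 2770]  # thousands; a = 2150 from (A)

median_position = inner_cf[-1] / 2  # 1385 (thousand)
median_age = interpolate_quantile(ages, inner_cf, median_position)
print(round(median_age, 1))  # 32.5
```

The target of 1385 thousand falls between the cumulative frequencies at ages 30 and 40, so the sketch reproduces $30 + \dfrac{145}{570}\times10$.
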
## Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Frequency densities: 56, 65, 77, 59, 45, 17 (accept 45.33 and 17.43 for 45 and 17) | B1, B1 | B1 for any one correct; B1 for all correct |
| Histogram drawn correctly | G1, G1, G1 [5] | G1 linear scales both axes; G1 heights FT their listed fds, also widths, all blocks joined; G1 appropriate label for vertical scale e.g. 'Frequency density (thousands)'; **Note: all G marks dep on attempt at frequency density, NOT frequency** |
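
The frequency densities listed above are frequency divided by class width. A short Python sketch, using only the Outer London table from the question (the variable names are illustrative), shows the calculation:

```python
# Outer London table from the question; frequencies are in thousands.
boundaries = [0, 20, 30, 40, 50, 65, 100]
frequencies = [1120, 650, 770, 590, 680, 610]

# Class width = upper boundary - lower boundary.
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]

# Frequency density = frequency / class width (thousands per year of age).
densities = [f / w for f, w in zip(frequencies, widths)]

for lo, hi, fd in zip(boundaries, boundaries[1:], densities):
    print(f"{lo:>3} to {hi:<3}  fd = {fd:.2f}")
```

Because bar area (width × density) equals frequency, the areas sum back to the total of 4420 thousand, which is a quick check on a drawn histogram.
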
## Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Any two suitable comments, e.g.: Outer London has greater proportion of people under 20; modal group in Inner London is 20–30 but Outer London is 30–40; Outer London has greater proportion (14%) aged 65+; all populations in each age group higher in Outer London; Outer London has more evenly spread/balanced distribution | E1, E1 [2] | |
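
Several of these comments rest on proportions and modal classes, which can be checked from the two tables. The Python sketch below is illustrative only: it assumes $a = 2150$ from part (iii)(A) to recover the Inner London class frequencies, and the function names are hypothetical.

```python
boundaries = [0, 20, 30, 40, 50, 65, 100]
inner_cf = [660, 1240, 1810, 2150, 2490, 2770]  # thousands; a = 2150 from (iii)(A)
outer_freq = [1120, 650, 770, 590, 680, 610]    # thousands

# Inner London class frequencies are differences of cumulative frequencies.
inner_freq = [c - p for p, c in zip([0] + inner_cf, inner_cf)]


def percentages(freqs):
    """Percentage of the population in each class, to 1 d.p."""
    total = sum(freqs)
    return [round(100 * f / total, 1) for f in freqs]


def modal_class(freqs):
    """Class with the greatest frequency density, as (lower, upper)."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    densities = [f / w for f, w in zip(freqs, widths)]
    i = densities.index(max(densities))
    return boundaries[i], boundaries[i + 1]


print("Inner %:", percentages(inner_freq), "modal:", modal_class(inner_freq))
print("Outer %:", percentages(outer_freq), "modal:", modal_class(outer_freq))
```

The modal class is found from frequency density rather than raw frequency, since the classes have unequal widths.
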
## Part (vi)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Mean increase $\uparrow$; median unchanged $(-)$; midrange increase $\uparrow$; standard deviation increase $\uparrow$; interquartile range unchanged $(-)$ | B1–B4 [4] | Any one correct B1; any two correct B2; any three correct B3; all five correct B4 |
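
The directions of change can be confirmed by recomputing the estimates with the final class ending at 105. This is a hedged Python sketch, not the examiners' method: it assumes class midpoints for the mean and standard deviation (with divisor $n$), linear interpolation for the median and quartiles, and midrange taken as the average of the lowest and highest class boundaries. The function names are hypothetical, and recomputed values may differ from the quoted figures in the last digit because of intermediate rounding.

```python
from math import sqrt

FREQ = [1120, 650, 770, 590, 680, 610]  # Outer London, thousands


def quantile(bounds, freqs, position):
    """Linear interpolation within the class containing `position`."""
    cum = 0
    for lo, hi, f in zip(bounds, bounds[1:], freqs):
        if position <= cum + f:
            return lo + (position - cum) / f * (hi - lo)
        cum += f
    raise ValueError("position exceeds the total frequency")


def estimates(max_age):
    """Grouped-data estimates with the final class ending at `max_age`."""
    bounds = [0, 20, 30, 40, 50, 65, max_age]
    mids = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]
    n = sum(FREQ)
    mean = sum(f * x for f, x in zip(FREQ, mids)) / n
    variance = sum(f * x * x for f, x in zip(FREQ, mids)) / n - mean ** 2
    q1 = quantile(bounds, FREQ, n / 4)
    q3 = quantile(bounds, FREQ, 3 * n / 4)
    return {
        "mean": mean,
        "median": quantile(bounds, FREQ, n / 2),
        "midrange": (bounds[0] + bounds[-1]) / 2,
        "standard deviation": sqrt(variance),  # divisor n (an assumption)
        "interquartile range": q3 - q1,
    }


def direction(old, new):
    if new > old:
        return "increase"
    if new < old:
        return "decrease"
    return "unchanged"


before, after = estimates(100), estimates(105)
for name in before:
    print(f"{name}: {direction(before[name], after[name])}")
```

The median and both quartiles lie in classes below 65, so they do not depend on the upper boundary of the final class; only the measures that use the final midpoint or the maximum value change.
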
3 The histogram shows the age distribution of people living in Inner London in 2001.\\
\includegraphics[max width=\textwidth, alt={}, center]{b6d84f99-ee39-49c7-a5e8-05838efeef5a-2_804_1372_483_436}
Data sourced from the 2001 Census, \href{http://www.statistics.gov.uk}{www.statistics.gov.uk}
\begin{enumerate}[label=(\roman*)]
\item State the type of skewness shown by the distribution.
\item Use the histogram to estimate the number of people aged under 25.
\item The table below shows the cumulative frequency distribution.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age & 20 & 30 & 40 & 50 & 65 & 100 \\
\hline
Cumulative frequency (thousands) & 660 & 1240 & 1810 & $a$ & 2490 & 2770 \\
\hline
\end{tabular}
\end{center}
(A) Use the histogram to find the value of $a$.\\
(B) Use the table to calculate an estimate of the median age of these people.

The ages of people living in Outer London in 2001 are summarised below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age ( $x$ years) & $0 \leqslant x < 20$ & $20 \leqslant x < 30$ & $30 \leqslant x < 40$ & $40 \leqslant x < 50$ & $50 \leqslant x < 65$ & $65 \leqslant x < 100$ \\
\hline
Frequency (thousands) & 1120 & 650 & 770 & 590 & 680 & 610 \\
\hline
\end{tabular}
\end{center}
\item Illustrate these data by means of a histogram.
\item Make two brief comments on the differences between the age distributions of the populations of Inner London and Outer London.
\item The data given in the table for Outer London are used to calculate the following estimates.\\
Mean 38.5, median 35.7, midrange 50, standard deviation 23.7, interquartile range 34.4.\\
The final group in the table assumes that the maximum age of any resident is 100 years. These estimates are to be recalculated, based on a maximum age of 105, rather than 100. For each of the five estimates, state whether it would increase, decrease or be unchanged.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S1 Q3 [20]}}