| Exam Board | OCR MEI |
|---|---|
| Module | S1 (Statistics 1) |
| Marks | 4 |
| Topic | Data representation |
| Type | Describe shape or skewness of distribution |
| Difficulty | Moderate -0.8. This is a multi-part question testing basic histogram interpretation skills: identifying skewness by inspection, reading frequencies from bars, using cumulative frequency for median, and understanding how changing class boundaries affects summary statistics. All parts involve routine recall and standard procedures with no problem-solving or novel insight required. The most demanding part (vi) requires understanding which statistics depend on class boundaries, but this is still straightforward conceptual knowledge. |
| Spec | 2.02a Interpret single variable data: tables and diagrams; 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread |
## Question 7:
### Part (i)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Positive | B1 | |
| | **1** | |
### Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Number of people $= 20\times33(000)+5\times58(000)$ | M1, M1 | M1 first term; M1(indep) second term |
| $= 660(000)+290(000) = 950{,}000$ | A1 | NB answer of 950 scores M2A0 |
| | **3** | |
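The area calculation in part (ii) can be sketched in Python. This is a minimal illustration, assuming the bar heights of 33 and 58 (thousands of people per year of age) that appear in the working above; the function name `people_below` is hypothetical and not part of the mark scheme.

```python
# Bar heights taken from the mark scheme working above, in thousands of
# people per year of age. Only the first two bars are needed for "under 25".
bars = [
    (0, 20, 33),   # (lower bound, upper bound, frequency density)
    (20, 30, 58),
]


def people_below(age, bars):
    """Sum the bar areas (width x height) to the left of `age`, in thousands."""
    total = 0
    for lower, upper, density in bars:
        # Width of the part of this bar that lies below `age` (0 if none).
        width = max(0, min(age, upper) - lower)
        total += width * density
    return total


under_25 = people_below(25, bars)
print(under_25 * 1000)  # 950000
```

The result is in thousands until it is multiplied by 1000, which is why an answer of 950 loses the accuracy mark.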
### Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| (A) $a = 1810+340 = 2150$ | M1, A1 | cao 2150 or 2150 thousand (but not 215000) |
| | **2** | |
| (B) Median = age of $1{,}385(000)^{\text{th}}$ person or $1385.5(000)$ | B1 | B1 for $1{,}385(000)$ or $1385.5$ |
| Age 30, cf $=1{,}240(000)$; age 40, cf $=1{,}810(000)$ | | |
| Estimate median $= 30+\frac{145}{570}\times10$ | M1 | M1 for attempt to interpolate $\frac{145k}{570k}\times10$ (2.54 or better suggests this) |
| Median $= 32.5$ years $(32.54\ldots)$; if no working shown then 32.54 or better needed for M1A1; if 32.5 seen with no previous working allow SC1 | A1 | cao min 1dp |
| | **3** | |
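The interpolation for the median can be checked with a short Python sketch. It assumes linear interpolation within the class containing the median, as in the working above, and uses $a = 2150$ from part (A); the helper name `interpolate_quantile` is hypothetical.

```python
# Cumulative frequency table from the question (thousands), with a = 2150.
upper_bounds = [20, 30, 40, 50, 65, 100]
cum_freq = [660, 1240, 1810, 2150, 2490, 2770]


def interpolate_quantile(target, upper_bounds, cum_freq, lowest=0):
    """Age at which the cumulative frequency reaches `target`,
    assuming ages are spread evenly within each class."""
    prev_bound, prev_cf = lowest, 0
    for bound, cf in zip(upper_bounds, cum_freq):
        if target <= cf:
            fraction = (target - prev_cf) / (cf - prev_cf)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cf = bound, cf
    raise ValueError("target exceeds the total frequency")


# The median is the age of the 1385 (thousandth) person: half of 2770.
median = interpolate_quantile(cum_freq[-1] / 2, upper_bounds, cum_freq)
print(round(median, 2))  # 32.54
```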
### Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Frequency densities: $56, 65, 77, 59, 45, 17$ (accept 45.33 and 17.43 for 45 and 17) | B1 | B1 for any one correct; B1 for all correct (soi by listing or from histogram) |
| Histogram with linear scales on both axes (no inequalities) | G1 | Note: all G marks below dep on attempt at frequency density, NOT frequency |
| Heights FT their listed fds or all correct; also correct widths; all blocks joined | G1 | |
| Appropriate label for vertical scale e.g. 'Frequency density (thousands)', 'frequency (thousands) per 10 years', 'thousands of people per 10 years' (allow key), OR f.d. | G1 | |
| | **5** | |
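The frequency densities listed above follow from dividing each frequency by its class width. A minimal Python sketch, assuming densities are measured in thousands of people per year of age:

```python
# Outer London table from the question: (lower bound, upper bound, frequency
# in thousands).
classes = [
    (0, 20, 1120),
    (20, 30, 650),
    (30, 40, 770),
    (40, 50, 590),
    (50, 65, 680),
    (65, 100, 610),
]

# Frequency density = frequency / class width, so that bar area = frequency.
densities = [freq / (upper - lower) for lower, upper, freq in classes]
print([round(d, 2) for d in densities])
# [56.0, 65.0, 77.0, 59.0, 45.33, 17.43]
```

Multiplying each density by its class width recovers the original frequency, which is the property a histogram relies on.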
### Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Any two suitable comments such as: | E1, E1 | Two marks for any two valid comments |
| Outer London has a greater proportion (or %) of people under 20 (or almost equal proportion) | E1 | |
| The modal group in Inner London is 20-30 but in Outer London it is 30-40 | E1 | |
| Outer London has a greater proportion (14%) of people aged 65+ | E1 | |
| **All** populations in **each** age group are higher in Outer London | E1 | |
| Outer London has a more evenly spread distribution or balanced distribution (ages) o.e. | E1 | |
**Total: 2 marks**
---
### Part (vi)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Mean increase $\uparrow$ | | Any one correct: B1 |
| Median unchanged $(-)$ | | Any two correct: B2 |
| Midrange increase $\uparrow$ | | Any three correct: B3 |
| Standard deviation increase $\uparrow$ | | All **five** correct: B4 |
| Interquartile range unchanged $(-)$ | | |
**Total: 4 marks**
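The directions in part (vi) can be confirmed numerically. The sketch below recalculates the five estimates for a maximum age of 100 and of 105. It rests on assumptions the mark scheme does not state: class midpoints for the mean and standard deviation, divisor $n$ for the variance, and linear interpolation within a class for the median and quartiles. The function names are hypothetical. Under these assumptions the values for a maximum age of 100 land close to the estimates quoted in the question, though the last digit can differ depending on how intermediate values are rounded.

```python
from math import isclose, sqrt

# Outer London frequencies (thousands) for the six age classes.
FREQS = [1120, 650, 770, 590, 680, 610]


def grouped_estimates(max_age):
    """Estimate summary statistics from the grouped table.

    Assumptions: class midpoints, divisor n for the variance, and linear
    interpolation within a class for the quartiles.
    """
    bounds = [0, 20, 30, 40, 50, 65, max_age]
    classes = list(zip(bounds, bounds[1:], FREQS))
    n = sum(FREQS)
    mids = [(lo + hi) / 2 for lo, hi, _ in classes]
    mean = sum(f * m for f, m in zip(FREQS, mids)) / n
    variance = sum(f * m * m for f, m in zip(FREQS, mids)) / n - mean ** 2

    def quantile(p):
        # Walk through the classes until the cumulative frequency reaches p*n.
        target, cum = p * n, 0
        for lo, hi, f in classes:
            if target <= cum + f:
                return lo + (target - cum) / f * (hi - lo)
            cum += f
        raise ValueError("p must lie between 0 and 1")

    return {
        "mean": mean,
        "median": quantile(0.5),
        "midrange": (bounds[0] + bounds[-1]) / 2,
        "standard deviation": sqrt(variance),
        "interquartile range": quantile(0.75) - quantile(0.25),
    }


def direction(old, new):
    """Classify the change from `old` to `new`."""
    if isclose(old, new):
        return "unchanged"
    return "increase" if new > old else "decrease"


before, after = grouped_estimates(100), grouped_estimates(105)
changes = {name: direction(before[name], after[name]) for name in before}
for name, change in changes.items():
    print(f"{name}: {change}")
# mean: increase
# median: unchanged
# midrange: increase
# standard deviation: increase
# interquartile range: unchanged
```

The median and both quartiles fall below age 65, so moving the upper boundary of the final class leaves them untouched; the other three estimates depend on that boundary directly.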
---
**TOTAL: 20 marks**
7 The histogram shows the age distribution of people living in Inner London in 2001.\\
\includegraphics[max width=\textwidth, alt={}, center]{93bbc0cf-d3ad-4bc2-a6c6-36a3b8e394a9-4_805_1372_392_401}
Data sourced from the 2001 Census, \href{http://www.statistics.gov.uk}{www.statistics.gov.uk}
\begin{enumerate}[label=(\roman*)]
\item State the type of skewness shown by the distribution.
\item Use the histogram to estimate the number of people aged under 25.
\item The table below shows the cumulative frequency distribution.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age & 20 & 30 & 40 & 50 & 65 & 100 \\
\hline
Cumulative frequency (thousands) & 660 & 1240 & 1810 & $a$ & 2490 & 2770 \\
\hline
\end{tabular}
\end{center}
(A) Use the histogram to find the value of $a$.\\
(B) Use the table to calculate an estimate of the median age of these people.
The ages of people living in Outer London in 2001 are summarised below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | }
\hline
Age ( $x$ years) & $0 \leqslant x < 20$ & $20 \leqslant x < 30$ & $30 \leqslant x < 40$ & $40 \leqslant x < 50$ & $50 \leqslant x < 65$ & $65 \leqslant x < 100$ \\
\hline
Frequency (thousands) & 1120 & 650 & 770 & 590 & 680 & 610 \\
\hline
\end{tabular}
\end{center}
\item Illustrate these data by means of a histogram.
\item Make two brief comments on the differences between the age distributions of the populations of Inner London and Outer London.
\item The data given in the table for Outer London are used to calculate the following estimates.
Mean 38.5, median 35.7, midrange 50, standard deviation 23.7, interquartile range 34.4.\\
The final group in the table assumes that the maximum age of any resident is 100 years. These estimates are to be recalculated, based on a maximum age of 105, rather than 100. For each of the five estimates, state whether it would increase, decrease or be unchanged.\\[0pt]
[4]
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S1 Q7 [4]}}