| Exam Board | OCR MEI |
|---|---|
| Module | S1 (Statistics 1) |
| Marks | 18 |
| Topic | Data representation |
| Type | Draw histogram from frequency table |
| Difficulty | Moderate (-0.8). This is a straightforward multi-part statistics question requiring standard S1 techniques: drawing a histogram with unequal class widths (requiring frequency density calculation), discussing the midrange, estimating the mean and standard deviation from a grouped frequency table, applying the mean ± 2 standard deviations outlier rule, and basic proportional reasoning. All parts are routine textbook exercises with no novel problem-solving required, making it easier than average for A-level. |
| Spec | 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.02h Recognize outliers |
# Question 2:
## Part (i) - Histogram
| Answer | Marks | Guidance |
|--------|-------|----------|
| Frequency density table: $500 \leq x \leq 1000$: fd = 0.014; $1000 < x \leq 1500$: fd = 0.044; $1500 < x \leq 2000$: fd = 0.052; $2000 < x \leq 3000$: fd = 0.018; $3000 < x \leq 5000$: fd = 0.0035 | M1 | At least 4 fds correct for M1. M1 can also be gained from freq per 1000: 14, 44, 52, 18, 3.5 (at least 4 correct) and A1 for all correct, or freq per 500: 7, 22, 26, 9, 1.75. Accept any suitable unit for fd, eg freq per 1000, BUT NOT FD per 1000. Allow fds correct to at least 3dp. If fd not explicitly given, M1A1 can be gained from all heights correct (within one square) on histogram (and M1A0 if at least 4 correct). Allow restart with correct heights if given fd wrong |
| All fd's correct | A1 | |
| Linear scales on both axes and label on vertical axis | G1(L1) | Label required on vertical axis IN RELATION to first M1 mark, i.e. fd or frequency density or if relevant freq/1000 etc (NOT fd/1000, but allow $\text{fd} \times 1000$ etc). Accept f/w or f/cw. Ignore horizontal label and allow horizontal scale to start at 500. Can also be gained from an accurate key |
| Width of bars | G1(W1) | Must be drawn at 500, 1000 etc NOT 499.5 or 500.5 etc. NO GAPS ALLOWED. Must have linear scale. No inequality labels on their own such as $500 \leq S < 1000$ etc but allow if a clear horizontal linear scale is also given |
| Height of bars | G1(H1) | FT of heights dep on at least 3 heights correct and all must agree with their fds. If fds not given and one height is wrong then max M1A0G1G1G0. Visual check on y (within one square) - no need to measure precisely |
| Incorrect diagrams: Frequency diagrams can get M0, A0, G0, G1, G0 MAXIMUM. Frequency density = frequency $\times$ width, frequency/midpoint etc gets MAX M0A0G0G1G0. **Frequency polygons MAX M1A1G0G0G0** | | |
| **Total [5]** | | |
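
The frequency densities quoted above follow from dividing each frequency by its class width. A minimal Python sketch of that calculation is given here; the names `bounds`, `freqs` and `frequency_densities` are illustrative and not part of the mark scheme, while the numbers are those in the question's table.

```python
# Class boundaries (cm^3) and frequencies, copied from the question's table.
bounds = [500, 1000, 1500, 2000, 3000, 5000]
freqs = [7, 22, 26, 18, 7]


def frequency_densities(bounds, freqs):
    """Return frequency / class width for each class (hypothetical helper)."""
    widths = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return [f / w for f, w in zip(freqs, widths)]


fds = frequency_densities(bounds, freqs)
for lo, hi, fd in zip(bounds, bounds[1:], fds):
    # Bar area (fd * width) recovers the class frequency.
    print(f"{lo}-{hi}: fd = {fd:.4f}, area = {fd * (hi - lo):.0f}")
```

Each bar's area, not its height, represents the frequency, which is why the two widest classes have the lowest bars even though their frequencies are not the smallest.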
## Part (ii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Do not know exact highest and lowest values so cannot tell what the midrange is. OR No and a counterexample to show it may not be 2750. OR $(500 + 5000)/2 = 2750$. But very unlikely to be absolutely correct but probably close to the true value. Some element of doubt needed. Allow 'Likely to be correct' | E1 | Allow comment such as 'Highest value could be 5000 and lowest could be 500 therefore midrange could be 2750'. NO mark if incorrect calculation. Sight of 1750 AND 3000 (min and max of midrange) scores E1 |
| **Total [1]** | | |
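
The limits 1750 and 3000 mentioned in the guidance come from pairing the ends of the lowest and highest classes. The sketch below shows this under the assumption that the only information available is the class boundaries; `midrange_bounds` is a hypothetical helper name.

```python
# Lowest and highest classes (cm^3) from the question's table.
lowest_class = (500, 1000)
highest_class = (3000, 5000)


def midrange_bounds(lowest_class, highest_class):
    """Return limits on (min + max) / 2 given only the class boundaries."""
    smallest = (lowest_class[0] + highest_class[0]) / 2
    largest = (lowest_class[1] + highest_class[1]) / 2
    return smallest, largest


low, high = midrange_bounds(lowest_class, highest_class)
claimed = (500 + 5000) / 2  # the student's value, using the extreme boundaries
print(low, high, claimed)
```

The student's 2750 lies between these limits, so it is possible but cannot be confirmed from grouped data.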
## Part (iii)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\text{Mean} = \frac{(750 \times 7)+(1250 \times 22)+(1750 \times 26)+(2500 \times 18)+(4000 \times 7)}{80}$ | M1 | For midpoints (at least 3 correct). No marks for mean or sd unless using midpoints. Answer must NOT be left as improper fraction. CAO |
| $= \frac{151250}{80} = 1891$ | A1 | Accept correct answers for mean (1890 or 1891) and sd (850 or 846 or 845.5) from calculator even if eg $S_{xx}$ given wrong |
| $\sum x^2 f = (750^2 \times 7)+(1250^2 \times 22)+(1750^2 \times 26)+(2500^2 \times 18)+(4000^2 \times 7)$ $= 3937500 + 34375000 + 79625000 + 112500000 + 112000000 = 342437500$ | M1 | For sum of at least 3 correct multiples $fx^2$. Allow M1 for anything which rounds to 342400000 |
| $S_{xx} = 342437500 - \frac{151250^2}{80} = 56480469$ | | |
| $s = \sqrt{\frac{56480469}{79}} = \sqrt{714943} = 846$ | A1 | Only penalise once in part (iii) for over-specification, even if mean and standard deviation both over specified. Allow SC1 for RMSD 840.2 or 840 from calculator |
| Only an estimate since the data are grouped | E1 indep | Or for any mention of midpoints or 'don't have actual data' or 'data are not exact' oe |
| **Total [5]** | | |
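
The working above can be checked with a short Python sketch that uses class midpoints as stand-ins for the unknown data values. The function name `grouped_mean_sd` is illustrative; the divisor $n - 1$ matches the mark scheme's $s$, and the divisor $n$ gives the RMSD mentioned in the special case.

```python
import math

# Class midpoints (cm^3) and frequencies from the question's table.
midpoints = [750, 1250, 1750, 2500, 4000]
freqs = [7, 22, 26, 18, 7]


def grouped_mean_sd(midpoints, freqs):
    """Estimate mean, sample sd (divisor n - 1) and rmsd (divisor n)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    mean = sum_fx / n
    sxx = sum_fx2 - sum_fx ** 2 / n
    return mean, math.sqrt(sxx / (n - 1)), math.sqrt(sxx / n)


mean, sd, rmsd = grouped_mean_sd(midpoints, freqs)
print(f"mean = {mean:.1f}, sd = {sd:.1f}, rmsd = {rmsd:.1f}")
```

Because every value in a class is replaced by its midpoint, the results are estimates rather than exact statistics.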
## Part (iv)
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{x} - 2s = 1891 - (2 \times 846) = 199$; Allow 200 | M1 | For either. FT any positive mean and their positive sd/rmsd for M1. Only follow through numerical values, not variables such as $s$. No marks in (iv) unless using $\bar{x} + 2s$ or $\bar{x} - 2s$ |
| $\bar{x} + 2s = 1891 + (2 \times 846) = 3583$; Allow 3580 or 3600 | A1 | For both (FT). Do NOT penalise over-specification here as it is not the final answer |
| So there are probably some outliers | E1 | Must include an element of doubt. Dep on upper limit in range 3000–5000. Allow comments such as 'any value over 3583 is an outlier'. Ignore comments about possible outliers at lower end |
| **Total [3]** | | |
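
A small Python sketch of the $\bar{x} \pm 2s$ check is given below, using the rounded values 1891 and 846 from the mark scheme. The helper `outlier_limits` and the variable names are assumptions made for illustration.

```python
# Rounded estimates taken from the mark scheme for part (iii).
mean = 1891
sd = 846

# Extreme class boundaries (cm^3) from the question's table.
lowest_boundary = 500
highest_class = (3000, 5000)


def outlier_limits(mean, sd, k=2):
    """Return (mean - k*sd, mean + k*sd); k = 2 is the rule used here."""
    return mean - k * sd, mean + k * sd


lower, upper = outlier_limits(mean, sd)
# No value can fall below the lower limit if it is under the smallest boundary.
low_end_possible = lowest_boundary < lower
# Outliers are possible at the top if the limit falls inside the highest class.
high_end_possible = highest_class[0] < upper < highest_class[1]
print(lower, upper, low_end_possible, high_end_possible)
```

The upper limit falls inside the $3000 < x \leq 5000$ class, so some of its 7 cars may be outliers, but grouped data cannot show how many; hence the required element of doubt.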
## Part (v)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Number of cars over $2000 \text{ cm}^3 = \frac{25}{80} \times 2.5 \text{ million} = 781250$ | M1 | For $\frac{25}{80} \times 2.5$ million or $\frac{(18+7)}{80} \times 2.5$ million |
| So duty raised $= 781250 \times £1000 = £781 \text{ million}$ | M1 indep | For something $\times £1000$ even if this is the first step |
| | A1 | CAO. NB £781250000 is over-specified so only 2/3 |
| **Total [3]** | | |
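
The proportional reasoning above can be sketched in Python as follows. The variable names are illustrative, and the calculation assumes, as the question does, that the sample is representative of all new cars.

```python
# Figures from the question: sample counts, annual registrations, proposed duty.
sample_over_2000 = 18 + 7        # cars in the two classes above 2000 cm^3
sample_size = 80
new_cars_per_year = 2_500_000
duty_per_car = 1000              # pounds

# Scale the sample proportion up to the national total.
cars_over_2000 = new_cars_per_year * sample_over_2000 / sample_size
total_duty = cars_over_2000 * duty_per_car

print(f"cars over 2000 cm^3: {cars_over_2000:.0f}")
print(f"duty raised: about £{total_duty / 1e6:.0f} million")
```

Quoting the result to the nearest million reflects the guidance that £781250000 is over-specified for an estimate based on a sample of 80.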
## Part (vi)
| Answer | Marks | Guidance |
|--------|-------|----------|
| Because the numbers of cars sold with engine size greater than $2000 \text{ cm}^3$ might be reduced due to the additional duty | E1 | Allow any other reasonable suggestion. Condone 'sample may not be representative'. Allow 'sample is not of NEW cars' |
| **Total [1]** | | |
2 The engine sizes $x \mathrm{~cm}^{3}$ of a sample of 80 cars are summarised in the table below.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | }
\hline
Engine size $x$ & $500 \leqslant x \leqslant 1000$ & $1000 < x \leqslant 1500$ & $1500 < x \leqslant 2000$ & $2000 < x \leqslant 3000$ & $3000 < x \leqslant 5000$ \\
\hline
Frequency & 7 & 22 & 26 & 18 & 7 \\
\hline
\end{tabular}
\end{center}
(i) Draw a histogram to illustrate the distribution.\\
(ii) A student claims that the midrange is $2750 \mathrm{~cm}^{3}$. Discuss briefly whether he is likely to be correct.\\
(iii) Calculate estimates of the mean and standard deviation of the engine sizes. Explain why your answers are only estimates.\\
(iv) Hence investigate whether there are any outliers in the sample.\\
(v) A vehicle duty of $\pounds 1000$ is proposed for all new cars with engine size greater than $2000 \mathrm{~cm}^{3}$. Assuming that this sample of cars is representative of all new cars in Britain and that there are 2.5 million new cars registered in Britain each year, calculate an estimate of the total amount of money that this vehicle duty would raise in one year.\\
(vi) Why in practice might your estimate in part (v) turn out to be too high?
\hfill \mbox{\textit{OCR MEI S1 Q2 [18]}}