| Exam Board | CAIE |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2022 |
| Session | June |
| Marks | 9 |
| Topic | Measures of Location and Spread |
| Type | Histogram from continuous grouped data |
| Difficulty | Moderate (-0.8). A standard S1 histogram and summary statistics question that uses routine procedures: calculating frequency densities for unequal class widths, estimating the standard deviation from grouped data using the given mean, identifying the class interval that contains the upper quartile, and recognising how changes to individual values affect the grouped estimate of spread. All techniques are textbook exercises; no novel problem-solving is required. |
| Spec | 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation |
## Question 3(a):
| Answer | Mark | Guidance |
|--------|------|----------|
| Class widths: 20, 10, 10, 20, 30; Frequency densities: 22, 72, 92, 15, 4 | M1 | At least 4 frequency densities calculated (Frequency $\div$ class width, e.g. $\frac{440}{20}$, condone $\frac{440}{19.5}, \frac{440}{20.5}$). Accept unsimplified, may be read from graph |
| All heights correct on graph | A1 | NOT FT |
| Bar ends at $[0,]\ 20, 30, 40, 60, 90$ at axis with horizontal linear scale, at least 3 values indicated, $0 \leqslant$ horizontal scale $\leqslant 90$ | B1 | |
| Axes labelled frequency density (fd), time ($t$) and minutes (mins) or in title. Linear vertical scale, at least 3 values indicated $0 \leqslant$ vertical axes $\leqslant 92$ (condone 90 used) | B1 | |
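
The bar heights in the table above can be checked with a short script. This is a minimal sketch, assuming the class boundaries and frequencies are copied from the question's table; the variable names are illustrative and not part of the mark scheme.

```python
# Class boundaries and frequencies copied from the question's table.
boundaries = [0, 20, 30, 40, 60, 90]
frequencies = [440, 720, 920, 300, 120]

# Class width = upper boundary - lower boundary.
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]

# Frequency density = frequency / class width (the bar height).
densities = [f / w for f, w in zip(frequencies, widths)]

print(widths)     # [20, 10, 10, 20, 30]
print(densities)  # [22.0, 72.0, 92.0, 15.0, 4.0]
```

Because the class widths are unequal, plotting raw frequencies as heights would overstate the wider classes; dividing by the width makes each bar's area proportional to its frequency.
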
---
## Question 3(b):
| Answer | Mark | Guidance |
|--------|------|----------|
| Midpoints: 10, 25, 35, 50, 75 | B1 | At least 4 correct midpoints seen |
| $\text{Variance} = \frac{440\times10^2 + 720\times25^2 + 920\times35^2 + 300\times50^2 + 120\times75^2}{2500} - 31.44^2$ | M1 | Correct formula for variance or SD ($-$ mean$^2$ included with their midpoints, not upper bound, lower bound, class width, frequency density, frequency or cumulative frequency) and their $\sum f$ if calculated. Condone 1 data error. |
| Standard deviation $= 15.2$ | A1 | WWW, allow $15.16[3\ldots]$ |
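
The arithmetic behind the variance expression can be reproduced as follows. This is a sketch under the assumption that the mean is taken as the given value 31.44 rather than recomputed; the variable names are illustrative.

```python
import math

# Class boundaries and frequencies copied from the question's table.
boundaries = [0, 20, 30, 40, 60, 90]
frequencies = [440, 720, 920, 300, 120]
given_mean = 31.44  # estimate of the mean stated in the question

# Each class is represented by its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

n = sum(frequencies)                                         # total frequency
sum_fx2 = sum(f * x**2 for f, x in zip(frequencies, midpoints))

# Variance = (sum of f * x^2) / n - mean^2
variance = sum_fx2 / n - given_mean**2
sd = math.sqrt(variance)

print(midpoints)     # [10.0, 25.0, 35.0, 50.0, 75.0]
print(round(sd, 2))  # 15.16
```

To three significant figures this gives 15.2, which matches the A1 answer.
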
---
## Question 3(c):
| Answer | Mark | Guidance |
|--------|------|----------|
| $30 \leqslant t < 40$ | B1 | |
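
One way to confirm this class is to compare cumulative frequencies with the upper quartile position. The sketch below assumes the position is taken as three quarters of the total frequency, a common convention for grouped data; the variable names are illustrative.

```python
from itertools import accumulate

# Class boundaries and frequencies copied from the question's table.
boundaries = [0, 20, 30, 40, 60, 90]
frequencies = [440, 720, 920, 300, 120]

# Running totals of the frequencies, one per class.
cumulative = list(accumulate(frequencies))

# Assumed convention: the upper quartile sits at 3/4 of the total frequency.
target = 0.75 * cumulative[-1]

# First class whose cumulative frequency reaches the target position.
index = next(i for i, cf in enumerate(cumulative) if cf >= target)
uq_class = (boundaries[index], boundaries[index + 1])

print(cumulative)  # [440, 1160, 2080, 2380, 2500]
print(target)      # 1875.0
print(uq_class)    # (30, 40)
```

The position 1875 lies between the cumulative frequencies 1160 and 2080, so the upper quartile falls in the third class.
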
---
## Question 3(d):
| Answer | Mark | Guidance |
|--------|------|----------|
| Stays the same: both corrected times remain in their original class intervals | B1 | Frequencies unchanged |
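
The reasoning can be illustrated by checking which class each recorded and corrected time falls into. This is a small sketch; the helper `class_index` is hypothetical and introduced only for this example.

```python
import bisect

# Class boundaries copied from the question's table.
boundaries = [0, 20, 30, 40, 60, 90]

def class_index(t):
    """Return the index of the class lo <= t < hi that contains t."""
    return bisect.bisect_right(boundaries, t) - 1

# (recorded time, corrected time) for the two students in the question.
corrections = [(15, 5), (65, 75)]

# True where the corrected time stays in the same class as the recorded one.
same_class = [class_index(rec) == class_index(cor) for rec, cor in corrections]

print(same_class)  # [True, True]
```

Since neither correction moves a value into a different class, every class frequency is unchanged, and the estimate, which depends only on midpoints and frequencies, is unchanged too.
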
---
3 The times taken to travel to college by 2500 students are summarised in the table.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | }
\hline
Time taken $( t$ minutes $)$ & $0 \leqslant t < 20$ & $20 \leqslant t < 30$ & $30 \leqslant t < 40$ & $40 \leqslant t < 60$ & $60 \leqslant t < 90$ \\
\hline
Frequency & 440 & 720 & 920 & 300 & 120 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item Draw a histogram to represent this information.\\
\includegraphics[max width=\textwidth, alt={}, center]{d69f6a47-7c88-46b3-9e8f-07727106e987-04_1201_1198_1050_516}
From the data, the estimate of the mean value of $t$ is 31.44.
\item Calculate an estimate of the standard deviation of the times taken to travel to college.
\item In which class interval does the upper quartile lie?\\
It was later discovered that the times taken to travel to college by two students were incorrectly recorded. One student's time was recorded as 15 instead of 5 and the other's time was recorded as 65 instead of 75.
\item Without doing any further calculations, state with a reason whether the estimate of the standard deviation in part (b) would be increased, decreased or stay the same.
\end{enumerate}
\hfill \mbox{\textit{CAIE S1 2022 Q3 [9]}}