| Exam Board | OCR |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2007 |
| Session | January |
| Marks | 13 |
| Topic | Measures of Location and Spread |
| Type | Calculate statistics from discrete frequency table |
| Difficulty | Moderate -0.3 This is a standard S1 statistics question requiring routine calculations (median, IQR, mean, standard deviation from a frequency table) and straightforward interpretation. While multi-part with several marks, all techniques are textbook exercises with no novel problem-solving required. The conceptual questions at the end require only basic understanding of statistical measures, making this slightly easier than average overall. |
| Spec | 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation |

| Household size | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---|---|---|---|---|---|---|---|
| Withington | 34.1 | 26.1 | 12.7 | 12.8 | 8.2 | 4.0 | 2.1 |
| Old Moat | 35.1 | 27.1 | 14.7 | 11.4 | 7.6 | 2.8 | 1.3 |

| Median | Interquartile range | Mean | Standard deviation |
|---|---|---|---|
| 2 | 2 | 2.4 | 1.5 |

**i)**

| Answer | Marks | Guidance |
|---|---|---|
| Med = 2 | B1 | |
| LQ = 1 or UQ = 4 | M1 | cao or if treated as continuous data: read cf curve or interpolate at 25 and 75 |
| IQR = 3 | A1 | 3 marks cao |

**ii)**

| Answer | Marks | Guidance |
|---|---|---|
| $\sum xf$ attempted | M1 | $\geq 5$ terms; allow "midpts" in $\sum xf$ or $\sum x^2f$ |
| $\geq 5$ terms | A1 | |
| $2.6$ or 3 sf ans that rounds to 2.6 | A1 | |
| $\sum x^2f$ or $\sum (x-m)^2f$ | M1 | $\geq 5$ terms |
| $\sqrt{\frac{\sum x^2f}{100} - m^2}$ or $\sqrt{\frac{\sum (x-m)^2f}{100}}$ fully correct but ft $m$ | M1 | dep M3 |
| $1.6$ or $1.7$ or 3 sf ans that rounds to 1.6 or 1.7 | A1 | 6 marks; penalize $> 3$ sf only once |

**iii)**

| Answer | Marks | Guidance |
|---|---|---|
| Median less affected by extremes or outliers etc (NOT anomalies) | B1 | 1 mark; or median is an integer or mean not an integer, or not affected by open-ended interval |

**iv)**

| Answer | Marks | Guidance |
|---|---|---|
| Small change in variation leads to large change in IQR; UQ for W only just 4, hence IQR exaggerated; original data shows variations are similar | B1 | 1 mark; oe; specific comment essential; for Old Moat LQ only just 1 and UQ only just 3 |

**v)**

| Answer | Marks | Guidance |
|---|---|---|
| Old Moat; % (or $y$) decreases (as $x$ increases) oe | B1 B1 | 2 marks; NIS |

**Total for Question 8:** 13 marks
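
The mark-scheme values for parts (i) and (ii) can be checked numerically. The following is a minimal sketch, not part of the official mark scheme. It assumes the open-ended "7 or more" class is treated as exactly 7 (one acceptable assumption for part (ii)), and it takes a quartile to be the smallest household size whose cumulative percentage reaches 25, 50 or 75. The helper names `quantile` and `mean_and_sd` are illustrative only.

```python
import math

# Percentages of households by size, copied from the question's table.
# ASSUMPTION: the open-ended "7 or more" class is treated as exactly 7.
sizes = [1, 2, 3, 4, 5, 6, 7]
withington = [34.1, 26.1, 12.7, 12.8, 8.2, 4.0, 2.1]
old_moat = [35.1, 27.1, 14.7, 11.4, 7.6, 2.8, 1.3]


def quantile(values, percents, p):
    """Smallest value whose cumulative percentage reaches p (discrete data)."""
    cumulative = 0.0
    for value, percent in zip(values, percents):
        cumulative += percent
        if cumulative >= p:
            return value
    return values[-1]


def mean_and_sd(values, percents):
    """Mean and standard deviation, weighting each value by its percentage."""
    total = sum(percents)
    mean = sum(x * f for x, f in zip(values, percents)) / total
    mean_of_squares = sum(x * x * f for x, f in zip(values, percents)) / total
    return mean, math.sqrt(mean_of_squares - mean ** 2)


for name, ward in [("Withington", withington), ("Old Moat", old_moat)]:
    median = quantile(sizes, ward, 50)
    iqr = quantile(sizes, ward, 75) - quantile(sizes, ward, 25)
    mean, sd = mean_and_sd(sizes, ward)
    print(f"{name}: median={median}, IQR={iqr}, mean={mean:.1f}, sd={sd:.1f}")
```

Under these assumptions the Withington cumulative percentage is 72.9 at size 3, just short of 75, so the upper quartile is pushed to 4. For Old Moat it is 76.9 at size 3, so the upper quartile stays at 3. This is the sensitivity that part (iv) asks about.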
8 In the 2001 census, the household size (the number of people living in each household) was recorded. The percentages of households of different sizes were then calculated. The table shows the percentages for two wards, Withington and Old Moat, in Manchester.
\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\cline { 2 - 8 }
\multicolumn{1}{c|}{} & \multicolumn{7}{|c|}{Household size} \\
\cline { 2 - 8 }
\multicolumn{1}{c|}{} & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
\hline
Withington & 34.1 & 26.1 & 12.7 & 12.8 & 8.2 & 4.0 & 2.1 \\
\hline
Old Moat & 35.1 & 27.1 & 14.7 & 11.4 & 7.6 & 2.8 & 1.3 \\
\hline
\end{tabular}
\end{center}
(i) Calculate the median and interquartile range of the household size for Withington.\\
(ii) Making an appropriate assumption for the last class, which should be stated, calculate the mean and standard deviation of the household size for Withington. Give your answers to an appropriate degree of accuracy.
The corresponding results for Old Moat are as follows.
\begin{center}
\begin{tabular}{ | c | c | c | c | }
\hline
Median & \begin{tabular}{ c }
Interquartile \\
range \\
\end{tabular} & Mean & \begin{tabular}{ c }
Standard \\
deviation \\
\end{tabular} \\
\hline
2 & 2 & 2.4 & 1.5 \\
\hline
\end{tabular}
\end{center}
(iii) State one advantage of using the median rather than the mean as a measure of the average household size.\\
(iv) By comparing the values for Withington with those for Old Moat, explain briefly why the interquartile range may be less suitable than the standard deviation as a measure of the variation in household size.\\
(v) For one of the above wards, the value of Spearman's rank correlation coefficient between household size and percentage is $-1$. Without any calculation, state which ward this is. Explain your answer.
\hfill \mbox{\textit{OCR S1 2007 Q8 [13]}}
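
Part (v) is meant to be answered by inspection, but the conclusion can be confirmed with a short calculation. The sketch below is illustrative only: the helper names `ranks` and `spearman` are hypothetical, there are no tied values in either row, and "7 or more" is simply ranked as the largest household size. It uses the standard formula $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}$.

```python
# Percentages of households by size, copied from the question's table.
# ASSUMPTION: "7 or more" is ranked as the largest household size (7).
sizes = [1, 2, 3, 4, 5, 6, 7]
withington = [34.1, 26.1, 12.7, 12.8, 8.2, 4.0, 2.1]
old_moat = [35.1, 27.1, 14.7, 11.4, 7.6, 2.8, 1.3]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print("Withington:", round(spearman(sizes, withington), 3))
print("Old Moat:", round(spearman(sizes, old_moat), 3))
```

The Old Moat percentages decrease at every step as household size increases, so the ranks are exactly reversed and the coefficient is $-1$. For Withington the percentage rises from 12.7 to 12.8 between sizes 3 and 4, which breaks the strict decrease.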