Edexcel S1 2014 June — Question 1 9 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2014
Session: June
Marks: 9
Topic: Measures of Location and Spread
Type: Interpret or analyse given back-to-back stem-and-leaf
Difficulty: Moderate -0.8. This is a straightforward S1 question requiring basic stem-and-leaf reading skills and standard quartile/outlier calculations. Finding quartiles from ordered data and applying the 1.5×IQR outlier rule are routine procedures covered early in statistics courses, requiring minimal problem-solving beyond careful counting and arithmetic.
Spec: 2.02a Interpret single variable data: tables and diagrams; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.02h Recognize outliers; 2.02i Select/critique data presentation; 2.02j Clean data: missing data, errors

  1. A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.

| Totals | Greenslax | | Penville | Totals |
|--------|----------:|:-:|:---------|--------|
| (2) | 8 7 | 2 | 5 5 6 7 8 8 9 | (7) |
| (3) | 9 8 7 | 3 | 1 1 1 2 3 4 4 5 6 9 | (11) |
| (4) | 4 4 4 0 | 4 | 0 1 2 4 7 | (5) |
| (5) | 6 6 5 2 2 | 5 | 0 0 5 5 5 | (5) |
| (7) | 8 6 5 4 2 1 1 | 6 | 2 5 6 6 | (4) |
| (8) | 8 6 6 4 3 1 1 | 7 | 0 5 | (2) |
| (5) | 9 8 4 3 2 | 8 | | (0) |
| (1) | 4 | 9 | 9 | (1) |

Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville

Some of the quartiles for these two distributions are given in the table below.

| | Greenslax | Penville |
|---|-----------|----------|
| Lower quartile, \(Q_1\) | \(a\) | 31 |
| Median, \(Q_2\) | 64 | 39 |
| Upper quartile, \(Q_3\) | \(b\) | 55 |

(a) Find the value of \(a\) and the value of \(b\).

An outlier is a value that falls either

$$\begin{aligned} & \text { more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { above } Q _ { 3 } \\ & \text { or more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { below } Q _ { 1 } \end{aligned}$$

(b) On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any outliers.

(c) State the skewness of each distribution. Justify your answers.

# Question 1:

## Part (a)

| Answer | Mark | Guidance |
|--------|------|----------|
| $a = 44$ | B1 | These answers may be in or near the table |
| $b = 76$ | B1 | |
| | **(2)** | |

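The values in part (a) can be checked by counting through the Greenslax totals column. The following is a minimal sketch, not part of the mark scheme. It assumes the common S1 convention for ungrouped data that when n/4, n/2 or 3n/4 is not a whole number the position is rounded up; the whole-number case, where two values are averaged, does not arise for n = 35 and is not handled. The names `quartile_position`, `locate` and `quartile` are illustrative.

```python
import math

# Row totals for Greenslax, read from the Totals column (stems 2 to 9).
greenslax_totals = {2: 2, 3: 3, 4: 4, 5: 5, 6: 7, 7: 8, 8: 5, 9: 1}

# Leaves in ascending order for the rows the quartiles fall in.
# Assumption: the stem-7 row is copied as it appears in the diagram here,
# which shows seven leaves against a row total of 8.
greenslax_leaves = {
    4: [0, 4, 4, 4],
    6: [1, 1, 2, 4, 5, 6, 8],
    7: [1, 1, 3, 4, 6, 6, 8],
}

def quartile_position(n, fraction):
    """1-based position of a quartile, rounding up a non-whole n * fraction."""
    return math.ceil(n * fraction)

def locate(position, totals):
    """Return (stem, position within that stem's row) for a 1-based position."""
    running = 0
    for stem, count in totals.items():
        if position <= running + count:
            return stem, position - running
        running += count
    raise ValueError("position exceeds sample size")

n = sum(greenslax_totals.values())  # sample size from the totals column

def quartile(fraction):
    """Look up the quartile value from its row and position in that row."""
    stem, k = locate(quartile_position(n, fraction), greenslax_totals)
    return 10 * stem + greenslax_leaves[stem][k - 1]

a = quartile(0.25)      # 9th value
median = quartile(0.5)  # 18th value
b = quartile(0.75)      # 27th value
print(a, median, b)
```

The 9th value is the 4th entry in the stem-4 row and the 27th is the 6th entry in the stem-7 row, which gives the values in the table above; the 18th value reproduces the median of 64 stated in the question.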
## Part (b)

| Answer | Mark | Guidance |
|--------|------|----------|
| $55 + 1.5(55-31) = 91$ [and $31 - 1.5(55-31) = -5$] | M1 | For sight of $55 + 1.5(55-31)$ or 91 seen (possibly implied by RH whisker of box plot). May be implied by a fully correct box plot |
| Box with whiskers drawn | B1 | 1st B1: box with whiskers (condone missing median) |
| Values 25, 31, 39, 55; RH whisker to end at 75 or 91 | B1 | 2nd B1: accuracy must be within 0.5 of a square; lower quartile at 30 or 32 is OK. Two RH whiskers is B0 |
| Outlier plotted at 99 only | A1 | Allow cross to be vertically displaced. If RH whisker goes to 99, 2nd B0 and A0 even if outlier identified (require horizontal "gap" between RH whisker and outlier) |
| | **(4)** | A fully correct box plot scores 4/4. If **not** fully correct apply scheme and need evidence for M1. If two box plots are seen ignore the one for Greenslax. If not on graph paper M1 max for (b) |

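The outlier limits used in part (b) can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions rather than a required method: it uses the Penville quartiles given in the question and only the extreme Penville values read from the diagram (the smallest value and the values with stems 6 to 9), since no other value can lie near either limit. All variable names are illustrative.

```python
# Penville quartiles as given in the question's table.
q1, q2, q3 = 31, 39, 55

iqr = q3 - q1                  # interquartile range
upper_fence = q3 + 1.5 * iqr   # values above this are outliers
lower_fence = q1 - 1.5 * iqr   # values below this are outliers

# Penville extremes read from the diagram: the smallest value, plus the
# values with stems 6 to 9 (the only ones near the upper limit).
smallest = 25
upper_tail = [62, 65, 66, 66, 70, 75, 99]

outliers = [x for x in upper_tail if x > upper_fence]
if smallest < lower_fence:
    outliers.insert(0, smallest)

# One accepted whisker end is the largest value that is not an outlier;
# the mark scheme also accepts ending the whisker at the limit itself.
right_whisker = max(x for x in upper_tail if x <= upper_fence)

print(upper_fence, lower_fence, outliers, right_whisker)
```

The lower limit is negative, so no age can fall below it and the left whisker simply ends at the smallest value, 25.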
## Part (c)

| Answer | Mark | Guidance |
|--------|------|----------|
| Greenslax: $[Q_2 - Q_1 = 20,\ Q_3 - Q_2 = 12$ or $(Q_2 - Q_1) > (Q_3 - Q_2)] \Rightarrow$ $-$ve (skew) | B1 | 1st B1: Greenslax $-$ve skew. It must be clear which village each statement refers to; labels may be implied by the values quoted, but not simply by $Q_3 - Q_2 > Q_2 - Q_1$. If there is just one unlabelled comment, assume it refers to Penville |
| Penville: $[Q_2 - Q_1 = 8,\ Q_3 - Q_2 = 16$ or $(Q_3 - Q_2) > (Q_2 - Q_1)] \Rightarrow$ $+$ve (skew) | B1 | 2nd B1: Penville $+$ve skew. Don't insist on seeing "skew" so just $-$ve and $+$ve will do. Treat "correlation" as ISW |
| Justification that is consistent | ddB1 | 3rd ddB1: dependent on 1st and 2nd B marks being scored. Justification for **both** based on: quartiles, median relative to quartiles, or "tail". If only values for $Q_3 - Q_2$ etc are given they should be correct ft for Greenslax and correct for Penville. If values for Greenslax imply $+$ve skew then 1st B0 and 3rd B0 |
| | **(3)** | |
| | **Total 9** | |
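
The quartile comparison behind part (c) can be written as a small function. This is an illustrative sketch of one of the justifications the mark scheme accepts (comparing $Q_3 - Q_2$ with $Q_2 - Q_1$); the function name `quartile_skew` and the label "symmetric" for equal gaps are assumptions made for the example.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the two halves of the interquartile range."""
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    if upper_gap > lower_gap:
        return "positive"
    if upper_gap < lower_gap:
        return "negative"
    return "symmetric"

greenslax = quartile_skew(44, 64, 76)  # gaps of 20 and 12
penville = quartile_skew(31, 39, 55)   # gaps of 8 and 16
print(greenslax, penville)
```

A longer gap below the median than above it points to a tail on the left (negative skew), and the reverse points to a tail on the right (positive skew).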
\begin{enumerate}
  \item A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.
\end{enumerate}

\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|l|}
\hline
Totals & \multicolumn{8}{|c|}{Greenslax} & \multicolumn{3}{|c|}{} & \multicolumn{4}{|r|}{Penville} &  & \multicolumn{3}{|r|}{Totals} \\
\hline
(2) &  &  &  &  &  & 8 & 7 & 2 & 5 & 5 & 6 & 7 & 8 & 8 & 9 &  &  &  & (7) \\
\hline
(3) &  &  &  &  & 9 & 8 & 7 & 3 & 1 & 1 & 1 & 2 & 3 & 4 & 4 & 5 & 6 & 9 & (11) \\
\hline
(4) &  &  &  & 4 & 4 & 4 & 0 & 4 & 0 & 1 & 2 & 4 & 7 &  &  &  &  &  & (5) \\
\hline
(5) &  &  & 6 & 6 & 5 & 2 & 2 & 5 & 0 & 0 & 5 & 5 & 5 &  &  &  &  &  & (5) \\
\hline
(7) & 8 & 6 & 5 & 4 & 2 & 1 & 1 & 6 & 2 & 5 & 6 & 6 &  &  &  &  &  &  & (4) \\
\hline
(8) & 8 & 6 & 6 & 4 & 3 & 1 & 1 & 7 & 0 & 5 &  &  &  &  &  &  &  &  & (2) \\
\hline
(5) &  &  & 9 & 8 & 4 & 3 & 2 & 8 &  &  &  &  &  &  &  &  &  &  & (0) \\
\hline
(1) &  &  &  &  &  &  & 4 & 9 & 9 &  &  &  &  &  &  &  &  &  & (1) \\
\hline
\end{tabular}
\end{center}

Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville\\
Some of the quartiles for these two distributions are given in the table below.

\begin{center}
\begin{tabular}{ | l | c | c | }
\hline
 & Greenslax & Penville \\
\hline
Lower quartile, $Q _ { 1 }$ & $a$ & 31 \\
\hline
Median, $Q _ { 2 }$ & 64 & 39 \\
\hline
Upper quartile, $Q _ { 3 }$ & $b$ & 55 \\
\hline
\end{tabular}
\end{center}

(a) Find the value of $a$ and the value of $b$.

An outlier is a value that falls either

$$\begin{aligned}
& \text { more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { above } Q _ { 3 } \\
& \text { or more than } 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { below } Q _ { 1 }
\end{aligned}$$

(b) On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any outliers.\\
(c) State the skewness of each distribution. Justify your answers.\\

\includegraphics[max width=\textwidth, alt={}, center]{8270bcae-494c-4248-8229-a72e9e84eab0-03_930_1237_1800_367}\\

\hfill \mbox{\textit{Edexcel S1 2014 Q1 [9]}}