AQA S2 2012 January — Question 3 13 marks

Exam Board: AQA
Module: S2 (Statistics 2)
Year: 2012
Session: January
Marks: 13
Topic: Chi-squared test of independence
Type: Chi-squared test with algebraic entries
Difficulty: Standard +0.3. This is a straightforward chi-squared test of independence. Part (a) is standard bookwork requiring recall of the expected frequency formula and a simple algebraic verification; part (b) is a routine application to a 2×2 table. The calculations are mechanical and the question requires no novel insight, so it is slightly easier than average because of the small table size and guided structure.
Spec: 5.06a Chi-squared: contingency tables

## (a)(i)
| Answer | Marks | Guidance |
|---|---|---|
| $E_i: \frac{mp}{N}, \frac{mq}{N}, \frac{np}{N}, \frac{nq}{N}$ | B2,1 | B1: any one correct entry for $n = 1, 2, 4$; B2: all correct (simplified) |

## (a)(ii)
| Answer | Marks | Guidance |
|---|---|---|
| $\sum E_i = \frac{mp + mq + np + nq}{N} = \frac{m(p+q) + n(p+q)}{N}$ (oe) $= \frac{mN + nN}{N} = m + n = N$ (since $p + q = m + n = N$) | M1 Mdep1 Adep1 | Alternative: $\sum E_i = \frac{mp + mq + np + nq}{N} = \frac{m(p+q) + n(p+q)}{N}$ (or use of unsimplified forms) $= \frac{(p+q)(m+n)}{N} = \frac{N \times N}{N} = N$ |
| | | (AG) |
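
The identity in (a)(ii) can be checked numerically with a minimal sketch. The helper name `expected_frequencies` is hypothetical, introduced only for illustration, and the margins used are those of Table 2 in part (b). Exact fractions are used so that the sum can be compared with $N$ without rounding error. This is a check on one set of totals, not a replacement for the algebraic proof.

```python
from fractions import Fraction


def expected_frequencies(m, n, p, q):
    """Expected frequencies mp/N, mq/N, np/N, nq/N for a 2x2 table.

    m, n are the row totals and p, q the column totals (hypothetical helper).
    """
    N = m + n
    if p + q != N:
        raise ValueError("row totals and column totals must both sum to N")
    # Each expected frequency is (row total * column total) / grand total.
    return [Fraction(row * col, N) for row in (m, n) for col in (p, q)]


# Margins of Table 2: rows 33 and 17, columns 27 and 23, so N = 50.
cells = expected_frequencies(33, 17, 27, 23)
total = sum(cells)
print([float(e) for e in cells], total)
```

For these margins the four values are the expected frequencies quoted in the part (b) mark scheme, and their sum equals the grand total.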

## (b)
| Answer | Marks | Guidance |
|---|---|---|
| $H_0:$ No association between Andy's results and wind conditions | B1 | |
| $E_i:$ | M1 | Attempt E's |
| Won: $17.82,\ 15.18$ (total $33$); Lost: $9.18,\ 7.82$ (total $17$); Total: $27,\ 23,\ 50$ | | |
| $\Rightarrow \lvert O - E \rvert - 0.5 = 2.32$ | M1 | Yates' correction attempted |
| $X^2 = 0.3020 + 0.3546 + 0.5863 + 0.6883 = 1.93$ | M1 A1 | M1: final column attempted; A1: awrt 1.93 |
| $\chi^2_{10\%}(1) = 2.706$ | B1 | Correct value of $\chi^2$ only (allow 2.71) |
| $\Rightarrow$ Accept $H_0$ | Adep1 | dep (B1 for $H_0$) |
| **No association** (between Andy's results and wind conditions) | Edep1 | Appropriate conclusion, dep (B1 for $H_0$; M1 final column; $\chi^2_{10\%} = 2.706$) |
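
The arithmetic in part (b) can be reproduced with a short sketch in plain Python. The variable names are illustrative, and the critical value 2.706 is taken from the mark scheme rather than computed. The statistic applies Yates' correction, summing $(\lvert O - E \rvert - 0.5)^2 / E$ over the four cells.

```python
# Observed frequencies from Table 2: rows are Won/Lost, columns Windy/Not windy.
observed = [[15, 18], [12, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Yates-corrected statistic: sum of (|O - E| - 0.5)^2 / E over all cells.
statistic = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# 10% critical value for 1 degree of freedom, as quoted in the mark scheme.
CRITICAL_VALUE = 2.706
reject_h0 = statistic > CRITICAL_VALUE

print(round(statistic, 2), reject_h0)
```

The statistic comes out below the critical value, so $H_0$ is not rejected, which agrees with the mark scheme's conclusion of no association.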

**Total: 13 marks**

---
3
\begin{enumerate}[label=(\alph*)]
\item Table 1 contains the observed frequencies, $a , b , c$ and $d$, relating to the two attributes, $X$ and $Y$, required to perform a $\chi ^ { 2 }$ test.

\begin{table}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Table 1}
\begin{tabular}{ | c | c | c | c | }
\cline { 2 - 4 }
\multicolumn{1}{c|}{} & $\boldsymbol { Y }$ & Not $\boldsymbol { Y }$ & Total \\
\hline
$\boldsymbol { X }$ & $a$ & $b$ & $m$ \\
\hline
Not $\boldsymbol { X }$ & $c$ & $d$ & $n$ \\
\hline
Total & $p$ & $q$ & $N$ \\
\hline
\end{tabular}
\end{center}
\end{table}
\begin{enumerate}[label=(\roman*)]
\item Write down, in terms of $m , n , p , q$ and $N$, expressions for the 4 expected frequencies corresponding to $a , b , c$ and $d$.
\item Hence prove that the sum of the expected frequencies is $N$.
\end{enumerate}
\item Andy, a tennis player, wishes to investigate the possible effect of wind conditions on the results of his matches. The results of his matches for the 2011 season are represented in Table 2.

\begin{table}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Table 2}
\begin{tabular}{ | c | c | c | c | }
\cline { 2 - 4 }
\multicolumn{1}{c|}{} & Windy & Not windy & Total \\
\hline
Won & 15 & 18 & 33 \\
\hline
Lost & 12 & 5 & 17 \\
\hline
Total & 27 & 23 & 50 \\
\hline
\end{tabular}
\end{center}
\end{table}

Conduct a $\chi ^ { 2 }$ test, at the $10 \%$ level of significance, to investigate whether there is an association between Andy's results and wind conditions.\\
(8 marks)
\end{enumerate}

\hfill \mbox{\textit{AQA S2 2012 Q3 [13]}}