| Exam Board | OCR MEI |
|---|---|
| Module | S2 (Statistics 2) |
| Year | 2013 |
| Session | January |
| Marks | 18 |
| Topic | Chi-squared test of independence |
| Type | Standard 2×2 contingency table |
| Difficulty | Moderate (-0.3). Part (a) is a standard 2×2 chi-squared test of independence with straightforward calculation of expected frequencies and the test statistic. Part (b) is a routine z-test with known variance. Both are textbook applications requiring only procedural execution with no novel insight, though the combination of two tests and the multi-step working place the question slightly above pure recall. |
| Spec | 5.06a Chi-squared: contingency tables |
# Question 4(a):
| Answer | Marks | Guidance |
|--------|-------|----------|
| $H_0$: no association between grade and hours worked; $H_1$: some association between grade and hours worked | B1 | Hypotheses in context |
| Expected values table: A or B: Less than 5hrs = 17.05, At least 5hrs = 13.95; C or lower: Less than 5hrs = 15.95, At least 5hrs = 13.05 | M1, A1 | Any row/column correct for M1; For expected values (to 2 dp) for A1 |
| $(O-E)^2/E$ table: A or B: Less than 5hrs = 0.5104, At least 5hrs = 0.6238; C or lower: Less than 5hrs = 0.5456, At least 5hrs = 0.6669 | M1, A1 | For valid attempt at $(O-E)^2/E$, any row/column correct for M1; For all correct for A1. NB: These M1A1 marks cannot be implied by a correct final value of $X^2$ |
| $X^2 = 2.347$ | B1 | |
| Refer to $\chi_1^2$ | M1 | For 1 deg of freedom. No FT from here if wrong |
| Critical value at 5% level $= 3.841$ | A1 | CAO for cv or $p$-value $= 0.1255$. SC1 for cv or $p$-value if 1 dof not seen |
| Result is not significant. There is insufficient evidence to suggest that there is any association between hours worked and grade. | E1 | For conclusion in context. NB if $H_0$, $H_1$ reversed, or 'correlation' mentioned, do not award first B1 or final E1 |
| **[9]** | | |
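
### Numerical check for 4(a):

The figures in the table above can be reproduced with a short script. This is a minimal sketch and not part of the mark scheme: the function name `chi_squared_independence` and the variable names are illustrative, no continuity correction is applied (matching the values quoted above), and the `erfc` shortcut for the $p$-value is valid only for 1 degree of freedom. The 5% critical value 3.841 is taken from the mark scheme rather than computed.

```python
import math

# Observed counts from the question: rows are grades, columns are hours worked.
observed = [
    [20, 11],  # A or B:     less than 5 h, at least 5 h
    [13, 16],  # C or lower: less than 5 h, at least 5 h
]


def chi_squared_independence(table):
    """Return expected counts, (O-E)^2/E contributions and X^2 (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(table, expected)
    ]
    statistic = sum(sum(row) for row in contributions)
    return expected, contributions, statistic


expected, contributions, x2 = chi_squared_independence(observed)

# 1 degree of freedom only: P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(x2 / 2))

CRITICAL_5_PERCENT_1_DOF = 3.841  # quoted in the mark scheme
significant = x2 > CRITICAL_5_PERCENT_1_DOF

print("Expected:", [[round(e, 2) for e in row] for row in expected])
print("Contributions:", [[round(c, 4) for c in row] for row in contributions])
print("X^2 =", round(x2, 3), " p =", round(p_value, 4), " significant:", significant)
```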
---
# Question 4(b):
| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{x} = 417.79$ | B1 | For $\bar{x}$ |
| $H_0: \mu = 420$ | B1 | For use of 420 in hypotheses. Hypotheses in words must refer to population. Do not allow alternative symbols unless clearly defined as the population mean |
| $H_1: \mu \neq 420$ | B1 | For both correct |
| Where $\mu$ denotes the mean volume of the cans of tomato purée (in the population) | B1 | For definition of $\mu$. Condone omission of "population" if correct notation $\mu$ is used, but if $\mu$ is defined as the **sample** mean then award **B0** |
| Test statistic $= \dfrac{417.79 - 420}{3.5/\sqrt{10}} = \dfrac{-2.21}{1.107} = -1.997$ | M1*, A1 | Must include $\sqrt{10}$; FT their $\bar{x}$ |
| Lower critical value for a two-tailed test at the 1% level: $z = -2.576$ | B1* | For $-2.576$. Must be $-2.576$ unless it is clear that absolute values are being used |
| $-1.997 > -2.576$ | M1 dep* | For sensible comparison leading to a conclusion |
| So not significant. There is insufficient evidence to reject $H_0$. There is insufficient evidence to suggest that the mean volume of the cans of tomato purée is not 420 ml | A1 | For conclusion in words in context provided that correct cv used. FT only candidate's test statistic |
| **[9]** | | |
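
### Numerical check for 4(b):

The test statistic and critical value above can be checked with the Python standard library. This is a sketch under the question's stated assumptions (Normal volumes, known standard deviation 3.5); the constant and variable names are illustrative and not part of the mark scheme.

```python
from math import sqrt
from statistics import NormalDist, mean

# Sample of 10 can volumes (ml) from the question.
volumes = [417.2, 422.6, 414.3, 419.6, 420.4, 410.0, 418.3, 416.9, 418.9, 419.7]

MU_0 = 420.0   # hypothesised population mean under H0
SIGMA = 3.5    # known population standard deviation
ALPHA = 0.01   # significance level, two-tailed

x_bar = mean(volumes)
standard_error = SIGMA / sqrt(len(volumes))
z = (x_bar - MU_0) / standard_error

# Two-tailed test: put ALPHA / 2 in each tail of the standard Normal distribution.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
reject_h0 = abs(z) > z_crit

print(f"sample mean = {x_bar:.2f}")
print(f"z = {z:.3f}, critical values = ±{z_crit:.3f}")
print("reject H0" if reject_h0 else "insufficient evidence to reject H0")
```

Comparing $|z|$ with the positive critical value is equivalent to the mark scheme's comparison $-1.997 > -2.576$ on the lower tail.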
### Additional Notes for 4(b):
**Critical Value Method:** $420 - 2.576 \times 3.5 \div \sqrt{10}$ gets M1\*B1\*; $= 417.148\ldots$ gets A1; $417.79 > 417.148$ gets M1dep\*; A1 still available for correct conclusion in words & context.

**Confidence Interval Method:** CI centred on $417.79$: $417.79 \pm 2.5756 \times 3.5 \div \sqrt{10}$ gets M1\*B1\*; $= (414.93\ldots, 420.64\ldots)$ gets A1; "Contains 420" gets M1dep\*; A1 still available. Final M1dep\* A1 available only if 2.576 used.

**Probability Method:** Finding $P(\text{sample mean} < 417.79) = 0.0229$ gets M1\*A1B1\*; $0.0229 > \mathbf{0.005}$\* gets M1dep\*; A1 available for correct conclusion in words & context.
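
### Numerical check of the alternative methods for 4(b):

The three alternative methods above should all lead to the same decision. The sketch below recomputes each one from the values quoted in the mark scheme (sample mean 417.79, critical value 2.576); the dictionary `reject_h0` and the other names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

X_BAR = 417.79   # sample mean from the mark scheme
MU_0 = 420.0     # hypothesised mean under H0
SIGMA = 3.5      # known standard deviation
N = 10           # sample size
Z_CRIT = 2.576   # two-tailed 1% critical value quoted in the mark scheme

se = SIGMA / sqrt(N)

# Critical value method: lower boundary of the acceptance region for the sample mean.
lower_critical_mean = MU_0 - Z_CRIT * se

# Confidence interval method: 99% interval centred on the sample mean.
ci = (X_BAR - Z_CRIT * se, X_BAR + Z_CRIT * se)

# Probability method: lower-tail probability of the sample mean under H0,
# compared with 0.005 because the 1% level is split across two tails.
lower_tail_prob = NormalDist(MU_0, se).cdf(X_BAR)

reject_h0 = {
    "critical value": X_BAR < lower_critical_mean,
    "confidence interval": not (ci[0] <= MU_0 <= ci[1]),
    "probability": lower_tail_prob < 0.005,
}

print(f"lower critical mean = {lower_critical_mean:.3f}")
print(f"99% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"P(sample mean < {X_BAR}) = {lower_tail_prob:.4f}")
print(reject_h0)
```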
4
\begin{enumerate}[label=(\alph*)]
\item A random sample of 60 students studying mathematics was selected. Their grades in the Core 1 module are summarised in the table below, classified according to whether they worked less than 5 hours per week or at least 5 hours per week. Test, at the $5 \%$ significance level, whether there is any association between grade and hours worked.
\begin{center}
\begin{tabular}{ | c | l | c | c | }
\hline
\multicolumn{2}{|c|}{} & \multicolumn{2}{c|}{Hours worked} \\
\cline { 3 - 4 }
\multicolumn{2}{|c|}{} & Less than 5 & At least 5 \\
\hline
\multirow{2}{*}{Grade} & A or B & 20 & 11 \\
\cline { 2 - 4 }
& C or lower & 13 & 16 \\
\hline
\end{tabular}
\end{center}
\item At a canning factory, cans are filled with tomato purée. The machine which fills the cans is set so that the volume of tomato purée in a can, measured in millilitres, is Normally distributed with mean 420 and standard deviation 3.5. After the machine is recalibrated, a quality control officer wishes to check whether the mean is still 420 millilitres. A random sample of 10 cans of tomato purée is selected and the volumes, measured in millilitres, are as follows.
$$\begin{array} { l l l l l l l l l l }
417.2 & 422.6 & 414.3 & 419.6 & 420.4 & 410.0 & 418.3 & 416.9 & 418.9 & 419.7
\end{array}$$
Carry out a test at the $1 \%$ significance level to investigate whether the mean is still 420 millilitres. You should assume that the volumes are Normally distributed with unchanged standard deviation.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI S2 2013 Q4 [18]}}