OCR MEI S2 2013 January: Question 4 (18 marks)

Exam Board: OCR MEI
Module: S2 (Statistics 2)
Year: 2013
Session: January
Marks: 18
Topic: Chi-squared test of independence
Type: Standard 2×2 contingency table
Difficulty: Moderate -0.3. Part (a) is a standard 2×2 chi-squared test of independence with straightforward calculation of expected frequencies and the test statistic. Part (b) is a routine z-test with known variance. Both are textbook applications requiring only procedural execution with no novel insight, though the combination and multi-step nature elevate it slightly above pure recall.
Spec: 5.06a Chi-squared: contingency tables

4
(a) A random sample of 60 students studying mathematics was selected. Their grades in the Core 1 module are summarised in the table below, classified according to whether they worked less than 5 hours per week or at least 5 hours per week. Test, at the \(5 \%\) significance level, whether there is any association between grade and hours worked.

| Grade | Hours worked: less than 5 | Hours worked: at least 5 |
|-------|---------------------------|--------------------------|
| A or B | 20 | 11 |
| C or lower | 13 | 16 |

(b) At a canning factory, cans are filled with tomato purée. The machine which fills the cans is set so that the volume of tomato purée in a can, measured in millilitres, is Normally distributed with mean 420 and standard deviation 3.5. After the machine is recalibrated, a quality control officer wishes to check whether the mean is still 420 millilitres. A random sample of 10 cans of tomato purée is selected and the volumes, measured in millilitres, are as follows.

$$\begin{array} { l l l l l l l l l l } 417.2 & 422.6 & 414.3 & 419.6 & 420.4 & 410.0 & 418.3 & 416.9 & 418.9 & 419.7 \end{array}$$

Carry out a test at the \(1 \%\) significance level to investigate whether the mean is still 420 millilitres. You should assume that the volumes are Normally distributed with unchanged standard deviation.

# Question 4(a):

| Answer | Marks | Guidance |
|--------|-------|----------|
| $H_0$: no association between grade and hours worked; $H_1$: some association between grade and hours worked | B1 | Hypotheses in context |
| Expected values table: A or B: Less than 5hrs = 17.05, At least 5hrs = 13.95; C or lower: Less than 5hrs = 15.95, At least 5hrs = 13.05 | M1, A1 | Any row/column correct for M1; For expected values (to 2 dp) for A1 |
| $(O-E)^2/E$ table: A or B: Less than 5hrs = 0.5104, At least 5hrs = 0.6238; C or lower: Less than 5hrs = 0.5456, At least 5hrs = 0.6669 | M1, A1 | For valid attempt at $(O-E)^2/E$, any row/column correct for M1; For all correct for A1. NB: These M1A1 marks cannot be implied by a correct final value of $X^2$ |
| $X^2 = 2.347$ | B1 | |
| Refer to $\chi_1^2$ | M1 | For 1 deg of freedom. No FT from here if wrong |
| Critical value at 5% level $= 3.841$ | A1 | CAO for cv or $p$-value $= 0.1255$. SC1 for cv or $p$-value if 1 dof not seen |
| Result is not significant. There is insufficient evidence to suggest that there is any association between hours worked and grade. | E1 | For conclusion in context. NB if $H_0$, $H_1$ reversed, or 'correlation' mentioned, do not award first B1 or final E1 |
| **[9]** | | |
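
The arithmetic in the table above can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the mark scheme. It relies on the identity that, for 1 degree of freedom, $P(\chi_1^2 > x) = \operatorname{erfc}(\sqrt{x/2})$, and it takes the critical value 3.841 from the mark scheme rather than computing it.

```python
from math import erfc, sqrt

# Observed counts from the question: rows are grades, columns are hours worked.
observed = [
    [20, 11],  # A or B:     less than 5 hours, at least 5 hours
    [13, 16],  # C or lower: less than 5 hours, at least 5 hours
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under H0: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell to the test statistic: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
x2 = sum(sum(row) for row in contributions)

# Upper-tail probability for chi-squared with 1 degree of freedom.
p_value = erfc(sqrt(x2 / 2))

CRITICAL_5_PERCENT_1_DOF = 3.841  # tabulated value quoted in the mark scheme
significant = x2 > CRITICAL_5_PERCENT_1_DOF

print("expected:", expected)
print("contributions:", contributions)
print("X^2 =", round(x2, 3), "p =", round(p_value, 4), "significant:", significant)
```

Printing the individual contributions mirrors the mark scheme's requirement that the $(O-E)^2/E$ values be shown, since those marks cannot be implied by the final value of $X^2$ alone.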

---

# Question 4(b):

| Answer | Marks | Guidance |
|--------|-------|----------|
| $\bar{x} = 417.79$ | B1 | For $\bar{x}$ |
| $H_0: \mu = 420$ | B1 | For use of 420 in hypotheses. Hypotheses in words must refer to population. Do not allow alternative symbols unless clearly defined as the population mean |
| $H_1: \mu \neq 420$ | B1 | For both correct |
| Where $\mu$ denotes the mean volume of the cans of tomato purée (in the population) | B1 | For definition of $\mu$. Condone omission of "population" if correct notation $\mu$ is used, but if $\mu$ is defined as the **sample** mean then award **B0** |
| Test statistic $= \dfrac{417.79 - 420}{3.5/\sqrt{10}} = \dfrac{-2.21}{1.107} = -1.997$ | M1*, A1 | Must include $\sqrt{10}$; FT their $\bar{x}$ |
| Lower 1% level 2 tailed critical value of $z = -2.576$ | B1* | For $-2.576$. Must be $-2.576$ unless it is clear that absolute values are being used |
| $-1.997 > -2.576$ | M1 dep* | For sensible comparison leading to a conclusion |
| So not significant. There is insufficient evidence to reject $H_0$. There is insufficient evidence to suggest that the mean volume of the cans of tomato purée is not 420 ml | A1 | For conclusion in words in context provided that correct cv used. FT only candidate's test statistic |
| **[9]** | | |
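
The z-test above can be reproduced as follows. This is a sketch assuming Python 3.8 or later, where `statistics.NormalDist` is available; the constant and variable names are illustrative. The critical value is computed from the standard Normal distribution and should agree with the tabulated $-2.576$ to three decimal places.

```python
from math import sqrt
from statistics import NormalDist, mean

# Sample volumes in millilitres, as given in the question.
volumes = [417.2, 422.6, 414.3, 419.6, 420.4, 410.0, 418.3, 416.9, 418.9, 419.7]

MU_0 = 420.0   # population mean under H0
SIGMA = 3.5    # known (unchanged) population standard deviation
ALPHA = 0.01   # significance level for the two-tailed test

x_bar = mean(volumes)
standard_error = SIGMA / sqrt(len(volumes))  # must use sqrt(10), not 10
z = (x_bar - MU_0) / standard_error

# Lower critical value: alpha / 2 in each tail of the standard Normal.
z_crit = NormalDist().inv_cdf(ALPHA / 2)

# Reject H0 only if z lies beyond either critical value.
reject_h0 = abs(z) > abs(z_crit)

print("x_bar =", round(x_bar, 2))
print("z =", round(z, 3), "critical value =", round(z_crit, 3))
print("reject H0:", reject_h0)
```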

### Additional Notes for 4(b):

**Critical Value Method:** $420 - 2.576 \times 3.5 \div \sqrt{10}$ gets M1\*B1\* $= 417.148\ldots$ gets A1; $417.79 > 417.148$ gets M1dep\*; A1 still available for correct conclusion in words & context

**Confidence Interval Method:** CI centred on $417.79 \pm 2.5756 \times 3.5 \div \sqrt{10}$ gets M1\*B1\* $= (414.93\ldots, 420.64\ldots)$ gets A1; "Contains 420" gets M1dep\*; A1 still available. Final M1dep\* A1 available only if 2.576 used.

**Probability Method:** Finding $P(\text{sample mean} < 417.79) = 0.0229$ gets M1\*A1B1\*; $0.0229 > \mathbf{0.005}$\* gets M1dep\*; A1 available for correct conclusion in words & context.
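
The three alternative methods can be compared side by side. The sketch below again assumes Python 3.8 or later for `statistics.NormalDist`, uses the tabulated value 2.576 throughout, and takes $\bar{x} = 417.79$ from the main solution; all names are illustrative. Because the sample mean is below 420, only the lower boundary of the critical region is checked.

```python
from math import sqrt
from statistics import NormalDist

MU_0, SIGMA, N = 420.0, 3.5, 10
X_BAR = 417.79     # sample mean from the main solution
Z_TABLE = 2.576    # two-tailed 1% critical value from tables

se = SIGMA / sqrt(N)

# Critical value method: lower boundary of the critical region for the sample mean.
lower_critical_mean = MU_0 - Z_TABLE * se
in_critical_region = X_BAR < lower_critical_mean

# Confidence interval method: 99% interval centred on the sample mean.
ci = (X_BAR - Z_TABLE * se, X_BAR + Z_TABLE * se)
ci_contains_mu0 = ci[0] <= MU_0 <= ci[1]

# Probability method: lower-tail probability of a sample mean this small under H0,
# compared with 0.005 (half of the 1% level, because the test is two-tailed).
tail_prob = NormalDist(MU_0, se).cdf(X_BAR)
tail_significant = tail_prob < 0.005

print("lower critical mean:", round(lower_critical_mean, 3), in_critical_region)
print("99% CI:", tuple(round(v, 2) for v in ci), ci_contains_mu0)
print("tail probability:", round(tail_prob, 4), tail_significant)
```

All three approaches lead to the same decision as the test-statistic method: the result is not significant at the 1% level.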