| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Minor |
| Year | 2021 |
| Session | November |
| Marks | 13 |
| Topic | Chi-squared goodness of fit |
| Type | Spreadsheet-based chi-squared test |
| Difficulty | Standard +0.3 This is a standard chi-squared test for independence with straightforward calculations. Students need to recall validity conditions, compute expected frequencies and contributions using given formulas, perform a hypothesis test, and interpret results. While it requires multiple steps and understanding of the chi-squared distribution, all techniques are routine for Further Statistics students with no novel problem-solving required. |
| Spec | 5.06a Chi-squared: contingency tables |
Question 3:
3 | (a) | The sample must be random | B1
[1] | 1.2
3 | (b) | E8: $\frac{23 \times 29}{120} = 5.5583$
C13: $\frac{(28 - 33.1417)^2}{33.1417} = 0.7977$ | B1
M1
A1
[3] | 1.1
1.1a
1.1
3 | (c) | $H_0$: No association between age and smoking (status)
$H_1$: Some association between age and smoking (status)
Degrees of freedom = 2
Critical value = 5.991
Test statistic = 3.3642 + 0.6964 + ... + 0.2792 = 6.4801
6.4801 > 5.991
There is sufficient evidence at the 5% level to suggest
that there is association between age and smoking
(status) | B1
B1
B1
B1FT
M1
A1
[6] | 3.4
3.3
1.1
1.1
2.2b
3.5a | Both hypotheses needed
Use of ‘correlation’ in place of
‘association’ is B0
or p-value = 0.0392
FT their value of C13
or $\chi^2_2(6.4801) = 0.9608$
or 0.9608 > 0.95 or 0.0392 < 0.05
Correct test and critical values required
Use of ‘correlation’ in place of
‘association’ is A0 | Comparing their test
and critical values
leading to a
conclusion.
Conclusion in context
3 | (d) | For 16-34 year olds the contribution of 3.3642 suggests
that more are smokers than would be expected.
For 35-59 year olds things are (approximately) as
expected if there were no association.
For people aged 60 and over the contribution of 1.1775
suggests that fewer are smokers than would be
expected. | E1
E1
E1
[3] | 2.3
3.5a
3.2a | Max of 2 marks out of 3 if no
contributions are mentioned.
Allow equivalent statements about
non-smokers | Should take each age
group in turn and
discuss status
Max 2 marks if done
differently
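
The values in parts (b) and (c) of the mark scheme can be reproduced with a short script. The following is a minimal sketch, not part of the mark scheme: the function names `expected_frequencies` and `contributions` are hypothetical, and it relies on the fact that for 2 degrees of freedom the upper tail probability of the chi-squared distribution is exactly $e^{-x/2}$, so no statistics library is needed.

```python
import math

# Observed frequencies from the spreadsheet
# (rows: smoker, non-smoker; columns: 16-34, 35-59, 60 and over).
observed = [[13, 7, 3], [28, 43, 26]]


def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def contributions(table, expected):
    """Contribution per cell: (observed - expected)^2 / expected."""
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


expected = expected_frequencies(observed)
contrib = contributions(observed, expected)
test_statistic = sum(sum(row) for row in contrib)

# Degrees of freedom = (2 - 1) * (3 - 1) = 2.
# For 2 d.f. only, P(X > x) = exp(-x / 2), so the 5% critical value is -2 ln(0.05).
critical_value = -2 * math.log(0.05)
p_value = math.exp(-test_statistic / 2)

print("E8  =", round(expected[0][2], 4))
print("C13 =", round(contrib[1][0], 4))
print("test statistic =", round(test_statistic, 4))
print("critical value =", round(critical_value, 3))
print("p-value =", round(p_value, 4))
print("reject H0 at 5%:", test_statistic > critical_value)
```

The closed-form tail probability is used only because there are 2 degrees of freedom here; for other tables a statistics library or printed tables would be needed.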
3 A student wants to know whether there is any association between age and whether or not people smoke. The student takes a sample of 120 adults and asks each of them whether or not they smoke. Below is a screenshot showing part of a spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted.
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|}
\hline
& A & B & C & D & E \\
\hline
1 & \multicolumn{2}{|c|}{\multirow{3}{*}{}} & \multicolumn{3}{|c|}{Observed frequency} \\
\hline
2 & & & \multicolumn{3}{|c|}{Age} \\
\hline
3 & & & 16-34 & 35-59 & 60 and over \\
\hline
4 & \multirow{2}{*}{Smoking status} & Smoker & 13 & 7 & 3 \\
\hline
5 & & Non-smoker & 28 & 43 & 26 \\
\hline
6 & & & \multicolumn{3}{|c|}{} \\
\hline
7 & & & \multicolumn{3}{|c|}{Expected frequency} \\
\hline
8 & & & 7.8583 & & \\
\hline
9 & & & 33.1417 & & \\
\hline
10 & & & \multicolumn{3}{|c|}{} \\
\hline
11 & & & \multicolumn{3}{|c|}{Contributions to the test statistic} \\
\hline
12 & & & 3.3642 & 0.6964 & 1.1775 \\
\hline
13 & & & & 0.1651 & 0.2792 \\
\hline
14 & & & & & \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item The student wants to carry out a chi-squared test to analyse the data.
State a requirement of the sample if the test is to be valid.
For the rest of this question, you should assume that this requirement is met.
\item Determine the missing values in each of the following cells.
\begin{itemize}
\item E8
\item C13
\end{itemize}
\item In this question you must show detailed reasoning.

Carry out a hypothesis test at the $5 \%$ significance level to investigate whether there is any association between age and smoking status.
\item Discuss what the data suggest about the smoking status for each different age group.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Minor 2021 Q3 [13]}}
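
For the final part, the contributions alone do not show the direction of each deviation, so it can help to compare observed and expected smoker counts for each age group. The sketch below is one possible way to do this; the helper name `smoker_deviations` is hypothetical, and deciding whether a contribution counts as large is a matter of judgement rather than something the code settles.

```python
# Observed (smokers, non-smokers) for each age group, from the spreadsheet.
observed = {
    "16-34": (13, 28),
    "35-59": (7, 43),
    "60 and over": (3, 26),
}


def smoker_deviations(table):
    """Return {age group: (observed - expected smokers, column contribution)}."""
    total_smokers = sum(s for s, _ in table.values())
    total_non = sum(n for _, n in table.values())
    grand_total = total_smokers + total_non
    result = {}
    for group, (smokers, non_smokers) in table.items():
        group_total = smokers + non_smokers
        exp_s = total_smokers * group_total / grand_total
        exp_n = total_non * group_total / grand_total
        # Sum of the smoker and non-smoker contributions for this age group.
        contribution = ((smokers - exp_s) ** 2 / exp_s
                        + (non_smokers - exp_n) ** 2 / exp_n)
        result[group] = (smokers - exp_s, contribution)
    return result


for group, (diff, contribution) in smoker_deviations(observed).items():
    direction = "more" if diff > 0 else "fewer"
    print(f"{group}: {direction} smokers than expected "
          f"(difference {diff:+.2f}, contribution {contribution:.4f})")
```

The sign of the difference gives the direction, while the size of the contribution indicates how much weight to give it: a small contribution, as for the 35-59 group, is consistent with frequencies being roughly as expected under no association.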