| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Major |
| Year | 2019 |
| Session | June |
| Marks | 13 |
| Topic | Chi-squared test for association (contingency table) |
| Type | Spreadsheet-based chi-squared test |
| Difficulty | Standard +0.3. A straightforward chi-squared test question with standard calculations. Part (a) requires routine expected-frequency calculations from row and column totals. Part (b) is a standard hypothesis test: sum the contributions and compare the total with the critical value. Part (c) asks for an interpretation of the largest contributions, which is standard practice. All steps are textbook procedures with no novel insight required, making it slightly easier than average. |
| Spec | 5.06a Chi-squared: contingency tables; 5.06b Fit prescribed distribution: chi-squared test |

| | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| 1 | Observed frequencies | | | | | |
| 2 | | Underweight | Normal | Overweight | Totals | |
| 3 | Non-smoker | 8 | 52 | 178 | 238 | |
| 4 | Light smoker | 10 | 40 | 68 | 118 | |
| 5 | Heavy smoker | 5 | 47 | 92 | 144 | |
| 6 | Totals | 23 | 139 | 338 | 500 | |
| 7 | | | | | | |
| 8 | Expected frequencies | | | | | |
| 9 | Non-smoker | 10.9480 | 66.1640 | 160.8880 | | |
| 10 | Light smoker | 5.4280 | | 79.7680 | | |
| 11 | Heavy smoker | | 40.0320 | 97.3440 | | |
| 12 | | | | | | |
| 13 | | | | | | |
| 14 | Non-smoker | 0.7938 | | 1.8200 | | |
| 15 | Light smoker | 3.8510 | 1.5785 | 1.7361 | | |
| 16 | Heavy smoker | 0.3982 | 1.2129 | 0.2934 | | |
| 17 | | | | | | |

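Question 5:

| Part | Answer | Marks | Guidance |
|---|---|---|---|
| 5 (a) | B11: $\frac{144 \times 23}{500} = 6.6240$ | B1 | Allow these to be found by subtraction from row or column totals |
| | C10: $\frac{118 \times 139}{500} = 32.8040$ | B1 | |
| | C14: $\frac{(52 - 66.164)^2}{66.164} = 3.0321$ | M1 A1 | M1 for $\frac{(O - E)^2}{E}$ used |
| | | [4] | |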
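
The part (a) arithmetic can be checked with a short script. This is a minimal sketch, not part of the mark scheme: the names `observed`, `expected` and `contribution` are illustrative assumptions, and the only inputs are the observed frequencies in rows 3 to 5 of Fig. 5.

```python
# Observed frequencies from rows 3-5 of Fig. 5
# (columns: underweight, normal, overweight).
observed = {
    "Non-smoker": [8, 52, 178],
    "Light smoker": [10, 40, 68],
    "Heavy smoker": [5, 47, 92],
}

row_totals = {status: sum(counts) for status, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())


def expected(status, col):
    """Expected frequency = row total * column total / grand total."""
    return row_totals[status] * col_totals[col] / grand_total


def contribution(status, col):
    """Contribution to the test statistic = (O - E)^2 / E."""
    e = expected(status, col)
    return (observed[status][col] - e) ** 2 / e


b11 = expected("Heavy smoker", 0)    # heavy smoker, underweight
c10 = expected("Light smoker", 1)    # light smoker, normal
c14 = contribution("Non-smoker", 1)  # non-smoker, normal

print(f"B11 = {b11:.4f}, C10 = {c10:.4f}, C14 = {c14:.4f}")
```

The same two helper functions reproduce every other entry in rows 9 to 11 and 14 to 16 of the spreadsheet, which gives a quick consistency check on the printed values.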

| Part | Answer | Marks | Guidance |
|---|---|---|---|
| 5 (b) | $H_0$: no association between smoking status and weight. $H_1$: some association between smoking status and weight | B1 | For both. Do NOT allow ‘relationship’ in place of association. Allow independent, not independent |
| | Degrees of freedom = 4 | B1 | |
| | Critical value = 13.28 | B1 | |
| | Test statistic = 14.716 | B1 | |
| | $14.716 > 13.28$ so reject $H_0$ | M1 | FT their test statistic provided that critical value is correct. |
| | There is sufficient evidence to suggest that there is some association between smoking status and weight | A1 | Do NOT allow ‘relationship’ here |
| | | [6] | If hypotheses the wrong way around MAX B0B1B1B1M0A0 |
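
The part (b) test can be sketched in a few lines of standard-library Python. Everything here is illustrative rather than taken from the mark scheme, apart from the observed frequencies and the critical value 13.28, which are copied from above. The p-value line is an extra check that the mark scheme does not ask for; it relies on the closed form $e^{-x/2}(1 + x/2)$ for the upper tail of a chi-squared distribution, which holds only for 4 degrees of freedom.

```python
import math

# Observed frequencies (columns: underweight, normal, overweight).
observed = [
    [8, 52, 178],   # non-smoker
    [10, 40, 68],   # light smoker
    [5, 47, 92],    # heavy smoker
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Test statistic: sum of (O - E)^2 / E over all nine cells.
statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        statistic += (o - e) ** 2 / e

# Degrees of freedom = (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical value quoted in the mark scheme (4 degrees of freedom, 1% level).
critical_value = 13.28

# Upper-tail probability; this closed form is valid only when dof == 4.
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)

reject_h0 = statistic > critical_value
print(f"statistic = {statistic:.3f}, dof = {dof}, "
      f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing the statistic with the critical value and comparing the p-value with 0.01 are equivalent decisions, so both should point the same way.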

| Part | Answer | Marks | Guidance |
|---|---|---|---|
| 5 (c) | For non-smokers the contribution of 3.0321 shows that rather fewer than expected are normal weight | B1 | |
| | For light smokers the contribution of 3.8510 shows that more than expected are underweight | B1 | |
| | For heavy smokers the contribution of 1.2129 shows that rather more than expected are of normal weight | B1 | |
| | | [3] | Do NOT allow ‘are more underweight than expected’ |
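
For part (c), the largest contribution in each row, together with the sign of $O - E$, determines the interpretation. The following is a hedged sketch under the same assumptions as the spreadsheet: the names `categories` and `largest` are hypothetical, and the wording it prints is only a prompt for the written answer, not a substitute for it.

```python
categories = ["underweight", "normal weight", "overweight"]
observed = {
    "Non-smoker": [8, 52, 178],
    "Light smoker": [10, 40, 68],
    "Heavy smoker": [5, 47, 92],
}

row_totals = {status: sum(counts) for status, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

largest = {}
for status, counts in observed.items():
    cells = []
    for j, o in enumerate(counts):
        e = row_totals[status] * col_totals[j] / n
        # Direction of the discrepancy: observed above or below expected.
        direction = "more" if o > e else "fewer"
        cells.append(((o - e) ** 2 / e, categories[j], direction))
    # Tuples compare on their first element, so max() picks the
    # cell with the largest contribution in this row.
    largest[status] = max(cells)

for status, (value, category, direction) in largest.items():
    print(f"{status}: {value:.4f} ({direction} than expected are {category})")
```

The three selected cells agree with the contributions 3.0321, 3.8510 and 1.2129 quoted in the mark scheme.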
5 In an investigation into the possible relationship between smoking and weight in adults in a particular country, a researcher selected a random sample of 500 adults.\\
The adults in the sample were classified according to smoking status (non-smoker, light smoker or heavy smoker, where light smoker indicates less than 10 cigarettes per day) and body weight (underweight, normal weight or overweight).
Fig. 5 is a screenshot showing part of the spreadsheet used to calculate the contributions for a chi-squared test. Some values in the spreadsheet have been deliberately omitted.
\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|}
\hline
& A & B & C & D & E & F \\
\hline
1 & \multicolumn{6}{|c|}{Observed frequencies} \\
\hline
2 & & Underweight & Normal & Overweight & Totals & \\
\hline
3 & Non-smoker & 8 & 52 & 178 & 238 & \\
\hline
4 & Light smoker & 10 & 40 & 68 & 118 & \\
\hline
5 & Heavy smoker & 5 & 47 & 92 & 144 & \\
\hline
6 & Totals & 23 & 139 & 338 & 500 & \\
\hline
7 & & & & & & \\
\hline
8 & \multicolumn{4}{|c|}{Expected frequencies} & & \\
\hline
9 & Non-smoker & 10.9480 & 66.1640 & 160.8880 & & \\
\hline
10 & Light smoker & 5.4280 & & 79.7680 & & \\
\hline
11 & Heavy smoker & & 40.0320 & 97.3440 & & \\
\hline
12 & & & & & & \\
\hline
13 & \multicolumn{6}{|c|}{} \\
\hline
14 & Non-smoker & 0.7938 & & 1.8200 & & \\
\hline
15 & Light smoker & 3.8510 & 1.5785 & 1.7361 & & \\
\hline
16 & Heavy smoker & 0.3982 & 1.2129 & 0.2934 & & \\
\hline
17 & & & & & & \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 5}
\end{center}
\end{table}
\begin{enumerate}[label=(\alph*)]
\item Showing your calculations, find the missing values in each of the following cells.
\begin{itemize}
\item B11
\item C10
\item C14
\end{itemize}
\item Complete the hypothesis test at the $1 \%$ level of significance.
\item For each smoking status, give a brief interpretation of the largest of the three contributions to the test statistic.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Major 2019 Q5 [13]}}