OCR MEI Further Statistics Major 2019 June: Question 5 (13 marks)

Exam Board: OCR MEI
Module: Further Statistics Major
Year: 2019
Session: June
Marks: 13
Topic: Chi-squared goodness of fit
Type: Spreadsheet-based chi-squared test
Difficulty: Standard +0.3. This is a straightforward chi-squared test question with standard calculations. Part (a) requires routine expected-frequency calculations from row and column totals. Part (b) is a standard hypothesis test: sum the contributions and compare the total with the critical value. Part (c) asks for interpretation of the largest contributions, which is standard practice. All steps are textbook procedures with no novel insight required, making it slightly easier than average.
Spec: 5.06a Chi-squared: contingency tables; 5.06b Fit prescribed distribution: chi-squared test


Question 5:

\begin{center}
\begin{tabular}{|l|p{7.5cm}|l|p{5cm}|}
\hline
 & Answer & Marks & Guidance \\
\hline
5(a) & B11: $\frac{144 \times 23}{500} = 6.6240$ & B1 & Allow these to be found by subtraction from row or column totals \\
 & C10: $\frac{118 \times 139}{500} = 32.8040$ & B1 & \\
 & C14: $\frac{(52 - 66.164)^2}{66.164} = 3.0321$ & M1, A1 & For $\frac{(O - E)^2}{E}$ used \\
 & & [4] & \\
\hline
5(b) & $H_0$: no association between smoking status and weight \newline $H_1$: some association between smoking status and weight & B1 & For both. Do NOT allow `relationship' in place of association \\
 & Degrees of freedom $= 4$ & B1 & \\
 & Critical value $= 13.28$ & B1 & \\
 & Test statistic $= 14.716$ & B1 & \\
 & $14.716 > 13.28$ so reject $H_0$ & M1 & FT their test statistic provided that critical value is correct. \\
 & There is sufficient evidence to suggest that there is some association between smoking status and weight & A1 & Do NOT allow `relationship' here \\
 & & [6] & If hypotheses the wrong way around, MAX B0B1B1B1M0A0. Allow independent, not independent \\
\hline
5(c) & For non-smokers the contribution of 3.0321 shows that rather fewer than expected are normal weight & B1 & \\
 & For light smokers the contribution of 3.8510 shows that more than expected are underweight & B1 & Do NOT allow `are more underweight than expected' \\
 & For heavy smokers the contribution of 1.2129 shows that rather more than expected are of normal weight & B1 & \\
 & & [3] & \\
\hline
\end{tabular}
\end{center}

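
As a check on part (b), the degrees of freedom and the test statistic can be reproduced from Fig. 5. This is a sketch that assumes the usual $(r-1)(c-1)$ rule for a contingency table with $r$ rows and $c$ columns. The symbols $r$, $c$, $\nu$ and $X^2$ are notation introduced here for illustration and do not appear in the mark scheme. The sum uses the nine contributions in rows 14 to 16, including the value 3.0321 found for C14.
\[
\nu = (r-1)(c-1) = (3-1)(3-1) = 4
\]
\[
X^2 = 0.7938 + 3.0321 + 1.8200 + 3.8510 + 1.5785 + 1.7361 + 0.3982 + 1.2129 + 0.2934 = 14.716
\]
Since $14.716 > 13.28$, the $1\%$ critical value of $\chi^2_4$, the result is significant at the $1\%$ level.
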
5 In an investigation into the possible relationship between smoking and weight in adults in a particular country, a researcher selected a random sample of 500 adults.\\
The adults in the sample were classified according to smoking status (non-smoker, light smoker or heavy smoker, where light smoker indicates less than 10 cigarettes per day) and body weight (underweight, normal weight or overweight).

Fig. 5 is a screenshot showing part of the spreadsheet used to calculate the contributions for a chi-squared test. Some values in the spreadsheet have been deliberately omitted.

\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|}
\hline
 & A & B & C & D & E & F \\
\hline
1 & \multicolumn{6}{|c|}{Observed frequencies} \\
\hline
2 &  & Underweight & Normal & Overweight & Totals &  \\
\hline
3 & Non-smoker & 8 & 52 & 178 & 238 &  \\
\hline
4 & Light smoker & 10 & 40 & 68 & 118 &  \\
\hline
5 & Heavy smoker & 5 & 47 & 92 & 144 &  \\
\hline
6 & Totals & 23 & 139 & 338 & 500 &  \\
\hline
7 &  &  &  &  &  &  \\
\hline
8 & \multicolumn{4}{|c|}{Expected frequencies} &  &  \\
\hline
9 & Non-smoker & 10.9480 & 66.1640 & 160.8880 &  &  \\
\hline
10 & Light smoker & 5.4280 &  & 79.7680 &  &  \\
\hline
11 & Heavy smoker &  & 40.0320 & 97.3440 &  &  \\
\hline
12 &  &  &  &  &  &  \\
\hline
13 & \multicolumn{6}{|c|}{} \\
\hline
14 & Non-smoker & 0.7938 &  & 1.8200 &  &  \\
\hline
15 & Light smoker & 3.8510 & 1.5785 & 1.7361 &  &  \\
\hline
16 & Heavy smoker & 0.3982 & 1.2129 & 0.2934 &  &  \\
\hline
17 &  &  &  &  &  &  \\
\hline
\end{tabular}
\captionsetup{labelformat=empty}
\caption{Fig. 5}
\end{center}
\end{table}
\begin{enumerate}[label=(\alph*)]
\item Showing your calculations, find the missing values in each of the following cells.

\begin{itemize}
  \item B11
  \item C10
  \item C14
\end{itemize}
\item Complete the hypothesis test at the $1 \%$ level of significance.
\item For each smoking status, give a brief interpretation of the largest of the three contributions to the test statistic.
\end{enumerate}

\hfill \mbox{\textit{OCR MEI Further Statistics Major 2019 Q5 [13]}}
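
The calculations behind rows 9 to 11 and rows 14 to 16 of Fig. 5 can be sketched as follows, assuming the standard contingency-table formulas. The symbols $R$, $C$ and $n$ are notation chosen here rather than taken from the paper: $R$ is a row total, $C$ a column total and $n = 500$ the sample size, with $O$ and $E$ the observed and expected frequencies.
\[
E = \frac{R \times C}{n}, \qquad \mbox{contribution} = \frac{(O - E)^2}{E}
\]
Because each contribution is a squared quantity, it does not show the direction of the discrepancy. For part (c) the direction comes from comparing $O$ with $E$ in the cell with the largest contribution in each row:
\[
\mbox{non-smoker, normal: } 52 < 66.1640, \qquad \mbox{light smoker, underweight: } 10 > 5.4280, \qquad \mbox{heavy smoker, normal: } 47 > 40.0320
\]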