Edexcel S3 2022 January, Question 4 (14 marks)

Exam Board: Edexcel
Module: S3 (Statistics 3)
Year: 2022
Session: January
Marks: 14
Topic: Chi-squared test of independence
Type: Standard 2×3 contingency table
Difficulty: Standard +0.3. This is a standard chi-squared test of independence with straightforward calculations (2×3 table, clear expected frequencies). Part (a) tests basic sampling knowledge, part (b) is a routine application of the test procedure, and part (c) requires understanding how sample size affects the test statistic. All of this is well-rehearsed S3 content with no novel problem-solving required.
Spec: 2.01c Sampling techniques: simple random, opportunity, etc.; 5.06a Chi-squared: contingency tables

A survey was carried out with students that had studied Maths, Physics and Chemistry at a college between 2016 and 2020. The students were divided into two groups \(A\) and \(B\).

(a) Explain how a sample could be obtained from this population using quota sampling.

The students were asked which of the three subjects they enjoyed the most. The results of the survey are shown in the table.

| Subject enjoyed the most | Maths | Physics | Chemistry | Total |
|---|---|---|---|---|
| Group A | 16 | 10 | 13 | 39 |
| Group B | 38 | 13 | 10 | 61 |
| Total | 54 | 23 | 23 | 100 |

(b) Test, at the \(5 \%\) level of significance, whether the subject enjoyed the most is independent of group. You should state your hypotheses, expected frequencies, test statistic and the critical value used for this test.

The Headteacher discovered later that the results were actually based on a random sample of 200 students but had been recorded in the table as percentages.

(c) For the test in part (b), state with reasons the effect, if any, that this information would have on

  (i) the null and alternative hypotheses,

  (ii) the critical value,

  (iii) the value of the test statistic,

  (iv) the conclusion of the test.

# Question 4:

## Part (a):
| Answer/Working | Mark | Guidance |
|---|---|---|
| Non-random sampling/description of non-random sampling | B1 | Correct statement referring to non-random sampling or description of a non-random method for selecting participants, e.g. choosing people as they leave school. Do not allow labelling or numbering. |
| From (different groups of the) population until each quota has been met | B1 | Correct statement referring to selection from different groups until quota is filled |
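
The two marking points (non-random selection, continued until each quota is met) can be illustrated with a minimal Python sketch. The student labels, the meeting order and the quotas of 2 per group are hypothetical values chosen for illustration; the question does not specify quota sizes or which groups the quotas are set on.

```python
def quota_sample(people, quotas):
    """Select people in the order they are met until every quota is filled.

    `people` is an iterable of (name, group) pairs and `quotas` maps each
    group to the number of people wanted from it. No random selection is
    involved, which is what makes this a non-random method.
    """
    remaining = dict(quotas)
    sample = []
    for name, group in people:
        # Accept this person only if their group's quota is not yet full.
        if remaining.get(group, 0) > 0:
            sample.append((name, group))
            remaining[group] -= 1
        # Stop as soon as every quota has been met.
        if all(count == 0 for count in remaining.values()):
            break
    return sample


# Hypothetical students, in the order an interviewer happens to meet them.
students_met = [
    ("S1", "A"), ("S2", "B"), ("S3", "B"), ("S4", "B"),
    ("S5", "A"), ("S6", "B"), ("S7", "A"),
]
sample = quota_sample(students_met, {"A": 2, "B": 2})
print(sample)
```

Here S4 is skipped because the quota for group B is already full, and selection stops at S5 once both quotas are met.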

## Part (b):
| Answer/Working | Mark | Guidance |
|---|---|---|
| $H_0$: Subject enjoyed most and group are independent; $H_1$: Subject enjoyed most and group are not independent | B1 | Both hypotheses correct. Must mention "Subject" and "group" at least once. May be written in terms of association |
| Expected values: Group A: Maths 21.06, Physics 8.97, Chemistry 8.97; Group B: Maths 32.94, Physics 14.03, Chemistry 14.03 | M1 | Some attempt at $\frac{(\text{Row Total}) \times (\text{Column Total})}{\text{Grand Total}}$. Can be implied by at least one correct $E_i$ to 1 dp |
| $\frac{(O-E)^2}{E}$ values: 1.215745, 0.118272, 1.81058, 0.77728, 0.075617, 1.157584 | dM1 | Dependent on 1st M1 for at least 2 correct terms for $\frac{(O-E)^2}{E}$ or $\frac{O^2}{E}$ or correct expressions with their $E_i$. Accept 2 sf accuracy |
| $X^2 = \sum\frac{(O-E)^2}{E}$ or $\sum\frac{O^2}{E} - 100$ | dM1 | Dependent on 2nd M1 for applying formula |
| $= 5.155\ldots$ awrt 5.16 or awrt 5.15 | A1 | awrt 5.16 |
| $\nu = (3-1)(2-1) = 2$ | B1 | $\nu = 2$ may be implied by correct critical value of 5.991 |
| $\chi^2_2(0.05) = 5.991$ | B1ft | 5.991; allow ft from stated degrees of freedom |
| Not in CR/not significant/do not reject $H_0$. There is not sufficient evidence to suggest that subject enjoyed and group are not independent | A1 | Dependent on 3rd M1 and 3rd B1. Correct contextualised conclusion not rejecting $H_0$. Must mention subject and group. Contradictory statements score A0. If no hypotheses or hypotheses wrong way round do not award. |
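
The arithmetic in part (b) can be checked with a short Python sketch. It uses only the observed counts from the question and takes the critical value 5.991 from the mark scheme rather than computing it; the function and variable names are illustrative choices, not part of the original.

```python
# Observed counts from the question's table.
# Rows: Group A, Group B. Columns: Maths, Physics, Chemistry.
observed = [
    [16, 10, 13],
    [38, 13, 10],
]


def expected_frequencies(table):
    """Return E = (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Return the sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """Return (rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Critical value for 2 degrees of freedom at the 5% level,
# copied from the mark scheme (not computed here).
CRITICAL_VALUE = 5.991

expected = expected_frequencies(observed)
statistic = chi_squared_statistic(observed)
reject_h0 = statistic > CRITICAL_VALUE

print([[round(e, 2) for e in row] for row in expected])
print(round(statistic, 3), degrees_of_freedom(observed), reject_h0)
```

The expected frequencies print as 21.06, 8.97, 8.97 and 32.94, 14.03, 14.03, and the statistic is about 5.155 with 2 degrees of freedom. Since 5.155 is below 5.991, `reject_h0` is `False`, in line with the mark scheme's conclusion.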

## Part (c):
| Answer/Working | Mark | Guidance |
|---|---|---|
| (i) No change (as the test is still the same) | B1 | A correct statement |
| (ii) No change (as $\nu = 2$ still) | B1 | A correct statement |
| (iii) Test statistic would double ($= 10.310\ldots$) as all observed and expected values are doubled | B1 | A correct statement which must state that the test statistic doubles |
| (iv) Conclusion is the opposite (there is sufficient evidence to suggest subject enjoyed and group are not independent) as test statistic is now greater than the critical value ($10.31 > 5.991$) | B1 | A correct statement with correct reasoning |
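
The doubling in (iii) can be shown in one line. As a sketch, write $O_i$ and $E_i$ for the part (b) values and $O_i'$, $E_i'$ for the values based on 200 students (the primed symbols are notation introduced here). Every observed count doubles, so $O_i' = 2O_i$. Every row total, column total and the grand total also double, so $E_i' = \frac{(2 \times \text{Row Total}) \times (2 \times \text{Column Total})}{2 \times \text{Grand Total}} = 2E_i$. Then

$$
X'^2 = \sum \frac{(O_i' - E_i')^2}{E_i'} = \sum \frac{(2O_i - 2E_i)^2}{2E_i} = \sum \frac{4(O_i - E_i)^2}{2E_i} = 2\sum \frac{(O_i - E_i)^2}{E_i} = 2X^2 .
$$

With $X^2 = 5.155\ldots$ this gives $X'^2 = 10.310\ldots > 5.991$, while $\nu = 2$ and the critical value are unchanged because the table still has 2 rows and 3 columns.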

---
\begin{enumerate}
  \item A survey was carried out with students that had studied Maths, Physics and Chemistry at a college between 2016 and 2020. The students were divided into two groups $A$ and $B$.\\
(a) Explain how a sample could be obtained from this population using quota sampling.
\end{enumerate}

The students were asked which of the three subjects they enjoyed the most. The results of the survey are shown in the table.

\begin{center}
\begin{tabular}{|l|l|l|l|l|}
\hline
\multirow{2}{*}{} & \multicolumn{4}{|c|}{Subject enjoyed the most} \\
\hline
 & Maths & Physics & Chemistry & Total \\
\hline
Group A & 16 & 10 & 13 & 39 \\
\hline
Group B & 38 & 13 & 10 & 61 \\
\hline
Total & 54 & 23 & 23 & 100 \\
\hline
\end{tabular}
\end{center}

(b) Test, at the $5 \%$ level of significance, whether the subject enjoyed the most is independent of group. You should state your hypotheses, expected frequencies, test statistic and the critical value used for this test.

The Headteacher discovered later that the results were actually based on a random sample of 200 students but had been recorded in the table as percentages.\\
(c) For the test in part (b), state with reasons the effect, if any, that this information would have on\\
(i) the null and alternative hypotheses,\\
(ii) the critical value,\\
(iii) the value of the test statistic,\\
(iv) the conclusion of the test.

\hfill \mbox{\textit{Edexcel S3 2022 Q4 [14]}}