OCR MEI Further Statistics Minor 2024 June — Question 4 12 marks

Exam Board: OCR MEI
Module: Further Statistics Minor
Year: 2024
Session: June
Marks: 12
Topic: Chi-squared test of independence
Type: Larger contingency table (4+ categories)
Difficulty: Moderate (-0.3). This is a standard chi-squared test of independence with straightforward calculations. Part (a) is basic statistical reasoning, part (b) requires simple expected frequency calculation using row/column totals, and part (c) involves comparing a given test statistic to critical values. The test statistic is provided, eliminating the most tedious calculation. This is slightly easier than average as it's a routine application of a standard technique with no conceptual challenges.
Spec: 2.01a Population and sample: terminology; 5.06a Chi-squared: contingency tables

    Question 4:
    Question | Part | Answer | Marks | AOs | Guidance | Additional guidance
    4 | (a) | If a sample is random then it is valid to draw (statistical) inferences from it. | B1 [1] | 2.4 | No context necessary; just the ideas that if random then the sample probably represents the population and if this is so then conclusions we draw are likely to be valid. | Needs a reference to the purpose of the sample, e.g. inference, investigation, analysis, test, statistic, conclusion…, and a word that qualifies validity, e.g. unbiased, proper, accurate.
    4 | (b) | $800 \times (127/800) \times (m/800)$ or $127 \times (m/800)$, where $m = 432$, $280$ or $88$. Brown: 68.58, Blue: 44.45, Green: 13.97 | M1 A1 [2] | 1.1, 1.1 | Showing any one correct calculation for expected frequency. For this mark condone any confusion between eye colours. At least 3 sf. | If no working shown, all three must be correct to at least 3 s.f. for both marks.
    4 | (c) | $H_0$: There is no association between hair colour and eye colour, and $H_1$: There is some association between hair colour and eye colour. Degrees of freedom $= (4 - 1)(3 - 1) = 6$. Critical value of $\chi^2_6$ at the 10% level $= 10.64$. $28.62 > 10.64$ so $H_0$ is rejected. There is sufficient evidence (at the 10% level) to suggest that there is some association between (natural) hair colour and (natural) eye colour. | B1 B1 B1 M1 A1 [5] | 3.3, 1.1, 3.4, 1.1, 2.2b | Or $H_0$: Hair colour and eye colour are independent; $H_1$: Hair colour and eye colour are not independent. Making correct comparison between given value and their CV and drawing consistent inference. Non-assertive, contextual conclusion from correct critical value. | p-value is $7.18 \times 10^{-5}$ (or $7.19 \times 10^{-5}$) $< 0.1$ so reject $H_0$.
    4 | (d) | $(19 - 13.97)^2 / 13.97 = 1.811$ | B1FT [1] | 1.1 | FT their expected value from (b). | Answer should be quoted to 4 sf. 1.786 comes from using 14 or 14.0.
    4 | (e) | The high levels of the ($\chi^2$) contributions imply that the number of blonde people with blue eyes is different/higher than expected and the number of blonde people with brown eyes is different/lower than expected. The fact that $61 > 44.45$ suggests that more people with blonde hair have blue eyes than would be expected (if there were no association), and $47 < 68.58$ suggests fewer people with blonde hair have brown eyes than expected. | B1 B1 [2] | 3.5a, 3.5a | Ignore comments about blonde hair/green eyes. If B0B0 then SC1 for fewer people with blonde hair have brown eyes than would be expected and more people with blonde hair have blue eyes than expected, provided $61 > 44.45$ and $47 < 68.58$ is quoted.
    4 | (f) | The test at the 1% significance level, since the test statistic exceeding the critical value is less likely to have been caused by random factors (i.e. if the null hypothesis is true). | B1 [1] | 3.5a | Or the chance that $H_0$ is rejected when true is lower, or the chance of a false positive is less. Need a comparative like less or lower. | If the conclusion to (c) is that there is no association then B1 can be awarded for “The test at the 10% level since if there is a small association it is more likely to be considered significant by this test so the fact that this test did not reject $H_0$ is more informative” oe.
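
    The arithmetic behind parts (b), (c) and (d) can be sketched as follows. This working is an illustration added here, not part of the mark scheme: the symbols $E$ and $O$ (expected and observed frequency) are notation assumed for the sketch, and the closed form used for the p-value is the standard upper-tail formula for a chi-squared distribution with 6 degrees of freedom.

    \[
    E_{\mathrm{brown}} = \frac{432 \times 127}{800} = 68.58, \qquad
    E_{\mathrm{blue}} = \frac{280 \times 127}{800} = 44.45, \qquad
    E_{\mathrm{green}} = \frac{88 \times 127}{800} = 13.97
    \]
    \[
    \frac{(O - E)^2}{E} = \frac{(19 - 13.97)^2}{13.97} = \frac{25.3009}{13.97} \approx 1.811
    \]
    \[
    P\left(\chi^2_6 > 28.62\right) = e^{-14.31}\left(1 + 14.31 + \frac{14.31^2}{2}\right) \approx 7.18 \times 10^{-5} < 0.1
    \]

    The three expected frequencies sum to 127, the blonde column total, which is a quick check on part (b). The last line agrees with the p-value quoted in the guidance for part (c) and leads to the same decision as comparing 28.62 with the critical value 10.64.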
    4 A genetics researcher is investigating whether there is any association between natural hair colour and natural eye colour. A random sample of 800 adults is selected. Each adult can categorise their natural hair colour as blonde, brown, black or red and their natural eye colour as brown, blue or green.
    \begin{enumerate}[label=(\alph*)]
    \item Explain the benefit of using a random sample in this investigation.
    
    The data collected from the sample are summarised in Table 4.1.
    
    \begin{table}[h]
    \begin{center}
    \captionsetup{labelformat=empty}
    \caption{Table 4.1}
    \begin{tabular}{|l|l|l|l|l|l|l|}
    \hline
    \multicolumn{2}{|c|}{\multirow{2}{*}{Observed frequency}} & \multicolumn{4}{|c|}{Hair Colour} &  \\
    \cline{3-7}
     &  & Blonde & Brown & Black & Red & Total \\
    \hline
    \multirow{3}{*}{Eye Colour} & Brown & 47 & 153 & 196 & 36 & 432 \\
    \hline
     & Blue & 61 & 78 & 115 & 26 & 280 \\
    \hline
     & Green & 19 & 22 & 31 & 16 & 88 \\
    \hline
     & Total & 127 & 253 & 342 & 78 & 800 \\
    \hline
    \end{tabular}
    \end{center}
    \end{table}
    
    The researcher decides to carry out a chi-squared test.
    \item Determine the expected frequencies for each eye colour in the blonde hair category.
    
    You are given that the test statistic is 28.62 to 2 decimal places.
    \item Carry out the chi-squared test at the 10\% significance level.
    
    Table 4.2 shows the chi-squared contributions for some of the categories. The contributions for the categories relating to green eye colour have been deliberately omitted.
    
    \begin{table}[h]
    \begin{center}
    \captionsetup{labelformat=empty}
    \caption{Table 4.2}
    \begin{tabular}{ | c | l | c | c | c | c | }
    \hline
    \multicolumn{2}{|c|}{\begin{tabular}{ c }
    Chi-squared \\
    contributions \\
    \end{tabular}} & \multicolumn{4}{|c|}{Hair Colour} \\
    \cline{3-6}
    \multicolumn{2}{|c|}{} & Blonde & Brown & Black & Red \\
    \hline
    \multirow{3}{*}{\begin{tabular}{ c }
    Eye \\
    Colour \\
    \end{tabular}} & Brown & 6.791 & 1.964 & 0.694 & 0.889 \\
    \cline { 2 - 6 }
     & Blue & 6.162 & 1.257 & 0.185 & 0.062 \\
    \cline { 2 - 6 }
     & Green &  &  &  &  \\
    \hline
    \end{tabular}
    \end{center}
    \end{table}
    \item Calculate the chi-squared contribution for the green eye and blonde hair category.
    \item With reference to the values in Table 4.2, discuss what the data suggest about brown eye colour and blue eye colour for people with blonde hair.
    \item A different researcher, carrying out the same investigation, independently takes a different random sample of size 800 and performs the same hypothesis test, but at the 1\% significance level, reaching the same conclusion as the original test.
    
    By comparing only the significance level of the two tests, specify which test, the one at the 10\% significance level or the one at the 1\% significance level, provides stronger evidence for the conclusion. Justify your answer.
    \end{enumerate}
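
    As a consistency check on Table 4.2 (a sketch added here, not part of the question paper), the omitted green-eye contributions can be reconstructed from Table 4.1. It assumes the usual rules: expected frequency $E$ is row total $\times$ column total $/$ grand total, and each contribution is $(O - E)^2 / E$, where $O$ is the observed frequency. Contributions are rounded to 3 decimal places.

    \[
    \begin{array}{lrrr}
    \mbox{Hair colour} & O & E = 88 \times \mbox{column total} / 800 & (O - E)^2 / E \\
    \mbox{Blonde} & 19 & 13.97 & 1.811 \\
    \mbox{Brown} & 22 & 27.83 & 1.221 \\
    \mbox{Black} & 31 & 37.62 & 1.165 \\
    \mbox{Red} & 16 & 8.58 & 6.417
    \end{array}
    \]

    The brown-eye row of Table 4.2 sums to 10.338, the blue-eye row to 7.666 and the reconstructed green-eye row to 10.614, giving $10.338 + 7.666 + 10.614 = 28.618$. This agrees with the stated test statistic of 28.62 to 2 decimal places.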
    
    \hfill \mbox{\textit{OCR MEI Further Statistics Minor 2024 Q4 [12]}}