5.06a Chi-squared: contingency tables

179 questions

AQA S2 2006 June Q4
13 marks Moderate -0.3
4 It is claimed that the area within which a school is situated affects the age profile of the staff employed at that school. In order to investigate this claim, the age profiles of staff employed at two schools with similar academic achievements are compared. Academia High School, situated in a rural community, employs 120 staff whilst Best Manor Grammar School, situated in an inner-city community, employs 80 staff. The percentage of staff within each age group, for each school, is given in the table.
| Age | Academia High School | Best Manor Grammar School |
|---|---|---|
| 22-34 | 17.5 | 40.0 |
| 35-39 | 60.0 | 45.0 |
| 40-59 | 22.5 | 15.0 |
    1. Form the data into a contingency table suitable for analysis using a \(\chi ^ { 2 }\) distribution.
      (2 marks)
    2. Use a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to determine whether there is an association between the age profile of the staff employed and the area within which the school is situated.
  1. Interpret your result in part (a)(ii) as it relates to the 22-34 age group.
AQA S2 2008 June Q1
9 marks Standard +0.3
1 It is thought that the incidence of asthma in children is associated with the volume of traffic in the area where they live. Two surveys of children were conducted: one in an area where the volume of traffic was heavy and the other in an area where the volume of traffic was light. For each area, the table shows the number of children in the survey who had asthma and the number who did not have asthma.
| | Asthma | No asthma | Total |
|---|---|---|---|
| Heavy traffic | 52 | 58 | 110 |
| Light traffic | 28 | 62 | 90 |
| Total | 80 | 120 | 200 |
  1. Use a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to determine whether the incidence of asthma in children is associated with the volume of traffic in the area where they live.
  2. Comment on the number of children in the survey who had asthma, given that they lived in an area where the volume of traffic was heavy.
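The arithmetic behind a test on a 2 × 2 table such as the one above can be sketched as follows. This is a minimal illustration, not a mark scheme: the function names are hypothetical, Yates' continuity correction is assumed because the table has one degree of freedom, and the critical value 3.841 is taken from standard chi-squared tables.

```python
# Observed counts from the asthma table:
# rows = heavy / light traffic, columns = asthma / no asthma.
observed = [[52, 58], [28, 62]]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared(table, yates=False):
    """Sum of (|O - E| - c)^2 / E, with c = 0.5 if Yates' correction is requested."""
    correction = 0.5 if yates else 0.0
    total = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            total += (abs(o - e) - correction) ** 2 / e
    return total


CRITICAL_5_PERCENT_1_DF = 3.841  # standard table value, 1 degree of freedom

statistic = chi_squared(observed, yates=True)
print(expected_counts(observed))  # [[44.0, 66.0], [36.0, 54.0]]
print(round(statistic, 3), statistic > CRITICAL_5_PERCENT_1_DF)
```

Comparing the observed 52 with the expected 44 in the heavy-traffic asthma cell is the kind of observation part 2 asks for.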
AQA S2 2011 June Q2
11 marks Moderate -0.3
2
  1. The continuous random variable \(X\) has a rectangular distribution defined by the probability density function $$f ( x ) = \begin{cases} 0.01 \pi & u \leqslant x \leqslant 11 u \\ 0 & \text { otherwise } \end{cases}$$ where \(u\) is a constant.
    1. Show that \(u = \frac { 10 } { \pi }\).
    2. Using the formulae for the mean and the variance of a rectangular distribution, find, in terms of \(\pi\), values for \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
    3. Calculate exact values for the mean and the variance of the circumferences of circles having diameters of length \(\left( X + \frac { 10 } { \pi } \right)\).
  2. A machine produces circular discs which have an area of \(Y \mathrm {~cm} ^ { 2 }\). The distribution of \(Y\) has mean \(\mu\) and variance 25. A random sample of 100 such discs is selected. The mean area of the discs in this sample is calculated to be \(40.5 \mathrm {~cm} ^ { 2 }\). Calculate a 95\% confidence interval for \(\mu\).

Emily believed that the performances of 16-year-old students in their GCSEs are associated with the schools that they attend. To investigate her belief, Emily collected data on the GCSE results for 2010 from four schools in her area. The table shows Emily's collected data, denoted by \(O _ { i }\), together with the corresponding expected frequencies, \(E _ { i }\), necessary for a \(\chi ^ { 2 }\) test.

    | School | \(\geq 5\) GCSEs: \(O _ { i }\) | \(\geq 5\) GCSEs: \(E _ { i }\) | \(1 \leqslant\) GCSEs \(< 5\): \(O _ { i }\) | \(1 \leqslant\) GCSEs \(< 5\): \(E _ { i }\) | No GCSEs: \(O _ { i }\) | No GCSEs: \(E _ { i }\) |
    |---|---|---|---|---|---|---|
    | Jolliffe College for the Arts | 187 | 193.15 | 93 | 90.62 | 30 | 26.23 |
    | Volpe Science Academy | 175 | 184.43 | 97 | 86.52 | 24 | 25.05 |
    | Radok Music School | 183 | 183.81 | 78 | 86.23 | 34 | 24.96 |
    | Bailey Language School | 265 | 248.61 | 112 | 116.63 | 22 | 33.76 |
    Emily used these values to correctly conduct a \(\chi ^ { 2 }\) test at the \(1 \%\) level of significance.
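When the observed and expected frequencies are both given, as in Emily's table, the test statistic is just the sum of \((O_i - E_i)^2 / E_i\) over the twelve cells. The sketch below assumes no cells are pooled (every expected frequency is above 5) and takes the 1% critical value for 6 degrees of freedom, 16.812, from standard tables; the variable names are illustrative only.

```python
# (O_i, E_i) pairs read across each row of Emily's table.
cells = [
    (187, 193.15), (93, 90.62), (30, 26.23),    # Jolliffe College for the Arts
    (175, 184.43), (97, 86.52), (24, 25.05),    # Volpe Science Academy
    (183, 183.81), (78, 86.23), (34, 24.96),    # Radok Music School
    (265, 248.61), (112, 116.63), (22, 33.76),  # Bailey Language School
]

statistic = sum((o - e) ** 2 / e for o, e in cells)

n_rows, n_cols = 4, 3
degrees_of_freedom = (n_rows - 1) * (n_cols - 1)

CRITICAL_1_PERCENT_6_DF = 16.812  # standard table value

print(round(statistic, 2), degrees_of_freedom)
print("reject H0" if statistic > CRITICAL_1_PERCENT_6_DF else "do not reject H0")
```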
AQA S2 2012 June Q6
11 marks Standard +0.3
6 Fiona, a lecturer in a school of engineering, believes that there is an association between the class of degree obtained by her students and the grades that they had achieved in A-level Mathematics. In order to investigate her belief, she collected the relevant data on the performances of a random sample of 200 recent graduates who had achieved grades A or B in A-level Mathematics. These data are tabulated below.
| A-level grade | Class of degree: 1 | 2(i) | 2(ii) | 3 | Total |
|---|---|---|---|---|---|
| A | 20 | 36 | 22 | 2 | 80 |
| B | 9 | 55 | 48 | 8 | 120 |
| Total | 29 | 91 | 70 | 10 | 200 |
  1. Conduct a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to determine whether Fiona's belief is justified.
  2. Make two comments on the degree performance of those students in this sample who achieved a grade B in A-level Mathematics.
AQA S2 2013 June Q2
10 marks Standard +0.3
2 A town council wanted residents to apply for grants that were available for home insulation. In a trial, a random sample of 200 residents was encouraged, either in a letter or by a phone call, to apply for the grants. The outcomes are shown in the table.
| | Applied for grant | Did not apply for grant | Total |
|---|---|---|---|
| Letter | 30 | 130 | 160 |
| Phone call | 14 | 26 | 40 |
| Total | 44 | 156 | 200 |
  1. The council believed that a phone call was more effective than a letter in encouraging people to apply for a grant. Use a \(\chi ^ { 2 }\)-test to investigate this belief at the \(5 \%\) significance level.
  2. After the trial, all the residents in the town were encouraged, either in a letter or by a phone call, to apply for the grants. It was found that there was no association between the method of encouragement and the outcome. State, with a reason, whether a Type I error, a Type II error or neither occurred in carrying out the test in part (a).
    (2 marks)
AQA S2 2014 June Q2
11 marks Moderate -0.3
2 A large multinational company recruits employees from all four countries in the UK. For a sample of 250 recruits, the percentages of males and females from each of the countries are shown in Table 1.

Table 1

| | England | Scotland | Wales | Northern Ireland |
|---|---|---|---|---|
| Male | 22.8 | 17.6 | 10.8 | 6.8 |
| Female | 15.6 | 17.2 | 7.6 | 1.6 |
  1. Add the frequencies to the contingency table, Table 2, below.
  2. Carry out a \(\chi ^ { 2 }\)-test at the \(10 \%\) significance level to investigate whether there is an association between country and gender of recruits.
  3. By comparing observed and expected values, make one comment about the distribution of female recruits.
    [1 mark]

    Table 2

    | | England | Scotland | Wales | Northern Ireland | Total |
    |---|---|---|---|---|---|
    | Male | | | | | 145 |
    | Female | | | | | 105 |
    | Total | | | | | 250 |
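A chi-squared test needs frequencies, not percentages, so the Table 1 percentages have to be converted using the sample size of 250 before Table 2 can be filled in. A minimal sketch of that conversion and of the resulting test statistic is given below; the names are illustrative, and the 10% critical value for 3 degrees of freedom (6.251) is taken from standard tables.

```python
SAMPLE_SIZE = 250
COUNTRIES = ["England", "Scotland", "Wales", "Northern Ireland"]

# Percentages of the whole sample, from Table 1.
percentages = {
    "Male": [22.8, 17.6, 10.8, 6.8],
    "Female": [15.6, 17.2, 7.6, 1.6],
}

# Each frequency is percentage * 250 / 100; round() absorbs floating-point noise.
frequencies = {
    sex: [round(p * SAMPLE_SIZE / 100) for p in row]
    for sex, row in percentages.items()
}

rows = list(frequencies.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]

# Sum of (O - E)^2 / E with E = row total * column total / grand total.
statistic = sum(
    (o - r * c / SAMPLE_SIZE) ** 2 / (r * c / SAMPLE_SIZE)
    for row, r in zip(rows, row_totals)
    for o, c in zip(row, col_totals)
)

CRITICAL_10_PERCENT_3_DF = 6.251  # standard table value, (2 - 1)(4 - 1) = 3 df

print(frequencies)
print(row_totals, col_totals)
print(round(statistic, 2), statistic > CRITICAL_10_PERCENT_3_DF)
```

The row totals reproduce the 145 and 105 already printed in Table 2, which is a useful check on the conversion.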
AQA S2 2015 June Q5
10 marks Standard +0.3
5 In a particular town, a survey was conducted on a sample of 200 residents aged 41 years to 50 years. The survey questioned these residents to discover the age at which they had left full-time education and the greatest rate of income tax that they were paying at the time of the survey. The summarised data obtained from the survey are shown in the table.
| Greatest rate of income tax paid | Age when leaving education (years): 16 or less | 17 or 18 | 19 or more | Total |
|---|---|---|---|---|
| Zero | 32 | 3 | 4 | 39 |
| Basic | 102 | 12 | 17 | 131 |
| Higher | 17 | 5 | 8 | 30 |
| Total | 151 | 20 | 29 | 200 |
  1. Use a \(\chi ^ { 2 }\)-test, at the \(5 \%\) level of significance, to investigate whether there is an association between age when leaving education and greatest rate of income tax paid.
  2. It is believed that residents of this town who had left education at a later age were more likely to be paying the higher rate of income tax. Comment on this belief.
    [1 mark]
Edexcel S3 Q4
11 marks Standard +0.3
4. A group of 40 males and 40 females were asked which of three "Reality TV" shows they liked most - Watched, Stranded or One-2-Win. The results were as follows:
| | Watched | Stranded | One-2-Win |
|---|---|---|---|
| Males | 21 | 6 | 13 |
| Females | 15 | 10 | 15 |
Stating your hypotheses clearly, test at the \(10 \%\) level whether or not there is a significant difference in the preferences of males and females.
Edexcel S3 Q6
14 marks Standard +0.3
6. A market researcher recorded the number of adverts for vehicles in each of three categories on ITV, Channel 4 and Channel 5 over a period of time. The results are shown in the table below.
| | ITV | Channel 4 | Channel 5 |
|---|---|---|---|
| Family Saloon | 69 | 35 | 28 |
| Sports Car | 20 | 28 | 18 |
| Off-road Vehicle | 12 | 22 | 8 |
  1. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is evidence of the proportion of adverts for each type of vehicle being dependent on the channel.
  2. Suggest a reason for your result in part (a).
Edexcel S3 Q5
13 marks Standard +0.3
5. A Policy Unit wished to find out whether attitudes to the European Union varied with age. It conducted a survey asking 200 individuals to which of three age groups they belonged and whether they regarded themselves as generally pro-Europe or Eurosceptic. The results are shown in the table below.
| | Pro-Europe | Eurosceptic |
|---|---|---|
| \(18 - 34\) years | 43 | 21 |
| \(35 - 54\) years | 30 | 36 |
| 55 years or over | 27 | 43 |
  1. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether attitudes to Europe are associated with age.
    (11 marks)
    The survey also asked people if they voted at the last election. When the above test was repeated using only the results from those who had voted a value of 4.872 was calculated for \(\sum \frac { ( O - E ) ^ { 2 } } { E }\). No classes were combined.
  2. Find if this value leads to a different result.
OCR MEI Further Statistics A AS 2018 June Q5
13 marks Standard +0.3
5 A random sample of workers for a large company were asked whether they are smokers, ex-smokers or have never smoked. The responses were classified by the type of worker: Managerial, Production line or Administrative. Fig. 5 is a screenshot showing part of the spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted.

Fig. 5 (spreadsheet columns A to F; the first column below gives the spreadsheet row, and blank cells are omitted values)

| Row | | | | | |
|---|---|---|---|---|---|
| 1 | Observed frequencies | | | | |
| 2 | | Smoker | Ex-smoker | Never smoked | Totals |
| 3 | Managerial | 2 | 10 | 5 | 17 |
| 4 | Production line | 18 | 15 | 21 | 54 |
| 5 | Administrative | 13 | 6 | 14 | 33 |
| 6 | Totals | 33 | 31 | 40 | 104 |
| 8 | Expected frequencies | | | | |
| 9 | | 5.3942 | 5.0673 | 6.5385 | |
| 10 | | 17.1346 | | 20.7692 | |
| 11 | | 10.4712 | 9.8365 | 12.6923 | |
| 13 | Contributions to the test statistic | | | | |
| 14 | | 2.1358 | 4.8017 | 0.3620 | |
| 15 | | 0.0437 | | 0.0026 | |
| 16 | | | 1.4964 | 0.1347 | |
| 17 | Test statistic | 9.66 | | | |
  1. (A) State the sample size.
    (B) State the null and alternative hypotheses for a test to investigate whether there is any association between type of worker and smoking status.
  2. Showing your calculations, find the missing values in each of the following cells.
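The omitted spreadsheet values can be recovered from the observed frequencies alone: each expected frequency is (row total × column total) / sample size, and each contribution is \((O - E)^2 / E\). The sketch below recomputes every cell so that the blanks in Fig. 5 can be filled and the rest checked against the printed values. Which cells count as "missing" is inferred here from the gaps in the figure, since the cell references are not given in the text above.

```python
observed = {
    "Managerial": [2, 10, 5],
    "Production line": [18, 15, 21],
    "Administrative": [13, 6, 14],
}
STATUSES = ["Smoker", "Ex-smoker", "Never smoked"]

labels = list(observed)
rows = [observed[label] for label in labels]
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)  # sample size

# Expected frequency = row total * column total / sample size.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell = (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(rows, expected)
]
statistic = sum(sum(row) for row in contributions)

for label, exp_row, con_row in zip(labels, expected, contributions):
    print(label, [round(e, 4) for e in exp_row], [round(c, 4) for c in con_row])
print("test statistic:", round(statistic, 2))
```

The recomputed total agrees with the 9.66 shown in row 17 of the spreadsheet.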
OCR MEI Further Statistics A AS 2022 June Q5
14 marks Standard +0.3
5 A researcher is investigating whether there is any relationship between the overall performance of a student at GCSE and their grade in A Level Mathematics. Their A Level Mathematics grade is classified as A* or A, B, C or lower, and their overall performance at GCSE is classified as Low, Middle, High. Data are collected for a sample of 80 students in a particular area. The researcher carries out a chi-squared test. The screenshot below shows part of a spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted.
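Spreadsheet columns A to E; the first column below gives the spreadsheet row, and blank cells are omitted values.

| Row | | | | |
|---|---|---|---|---|
| 1 | Observed frequency | | | |
| 2 | | A* or A | B | C or lower | Totals |
| 3 | Low | 6 | 13 | 9 | 28 |
| 4 | Middle | 10 | 6 | 8 | 24 |
| 5 | High | 15 | 10 | 3 | 28 |
| 6 | Totals | 31 | 29 | 20 | 80 |
| 8 | | | | |
| 9 | | A* or A | B | C or lower |
| 10 | Low | 10.85 | | |
| 11 | Middle | 9.30 | | |
| 12 | High | 10.85 | | |
| 13 | Contribution to the test statistic | | | |
| 15 | | A* or A | B | C or lower |
| 16 | Low | 2.1680 | 0.8002 | 0.5714 |
| 17 | Middle | 0.0527 | 0.8379 | 0.6667 |
| 18 | High | 1.5873 | | 2.2857 |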
  1. State what needs to be known about the sample for the test to be valid. For the remainder of this question, you should assume that the test is valid.
  2. Determine the missing values in each of the following cells.
    Carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any association between level of performance at GCSE and A Level Mathematics grade.
  3. Discuss briefly what the data suggest about A Level Mathematics grade for different levels of performance at GCSE.
  4. State one disadvantage of using a 10\% significance level rather than a 5\% significance level in a hypothesis test.
OCR MEI Further Statistics A AS 2020 November Q6
12 marks Standard +0.3
6 A researcher is investigating whether there is any relationship between whether a cyclist wears a helmet and the distance, \(x \mathrm {~m}\), the cyclist is from the kerb (the edge of the road). Data are collected at a particular location for a random sample of 250 cyclists. The researcher carries out a chi-squared test. Fig. 6 is a screenshot showing part of a spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted.

Fig. 6 (spreadsheet columns A to G; the first column below gives the spreadsheet row, and blank cells are omitted values)

| Row | | | | | | | |
|---|---|---|---|---|---|---|---|
| 1 | | | Observed frequency | | | | |
| 2 | | | \(x \leq 0.3\) | \(0.3 < x \leq 0.5\) | \(0.5 < x \leq 0.8\) | \(x > 0.8\) | Totals |
| 3 | Wears helmet | Yes | 26 | 27 | 23 | 46 | 122 |
| 4 | | No | 45 | 31 | 21 | 31 | 128 |
| 5 | | Totals | 71 | 58 | 44 | 77 | 250 |
| 7 | | | Expected frequency | | | | |
| 8 | | | \(x \leq 0.3\) | \(0.3 < x \leq 0.5\) | \(0.5 < x \leq 0.8\) | \(x > 0.8\) | |
| 9 | Wears helmet | Yes | 34.6480 | | | 37.5760 | |
| 10 | | No | 36.3520 | | | 39.4240 | |
| 12 | | | Contribution to the test statistic | | | | |
| 13 | | | \(x \leq 0.3\) | \(0.3 < x \leq 0.5\) | \(0.5 < x \leq 0.8\) | \(x > 0.8\) | |
| 14 | Wears helmet | Yes | 2.1585 | 0.0601 | 0.1087 | 1.8885 | |
| 15 | | No | 2.0573 | 0.0573 | | 1.8000 | |
  1. Showing your calculations, find the missing values in each of the following cells.
    Carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any association between helmet wearing and distance from the kerb.
  2. Discuss briefly what the data suggest about helmet wearing for different distances from the kerb.
OCR MEI Further Statistics Minor 2024 June Q4
12 marks Moderate -0.3
4 A genetics researcher is investigating whether there is any association between natural hair colour and natural eye colour. A random sample of 800 adults is selected. Each adult can categorise their natural hair colour as blonde, brown, black or red and their natural eye colour as brown, blue or green.
  1. Explain the benefit of using a random sample in this investigation.

    The data collected from the sample are summarised in Table 4.1.

    Table 4.1: observed frequency (rows are eye colour, columns are hair colour)

    | Eye colour | Blonde | Brown | Black | Red | Total |
    |---|---|---|---|---|---|
    | Brown | 47 | 153 | 196 | 36 | 432 |
    | Blue | 61 | 78 | 115 | 26 | 280 |
    | Green | 19 | 22 | 31 | 16 | 88 |
    | Total | 127 | 253 | 342 | 78 | 800 |

    The researcher decides to carry out a chi-squared test.
  2. Determine the expected frequencies for each eye colour in the blonde hair category. You are given that the test statistic is 28.62 to 2 decimal places.
  3. Carry out the chi-squared test at the 10\% significance level.

    Table 4.2 shows the chi-squared contributions for some of the categories. The contributions for the categories relating to green eye colour have been deliberately omitted.

    Table 4.2 (rows are eye colour, columns are hair colour)

    | Eye colour | Blonde | Brown | Black | Red |
    |---|---|---|---|---|
    | Brown | 6.791 | 1.964 | 0.694 | 0.889 |
    | Blue | 6.162 | 1.257 | 0.185 | 0.062 |
    | Green | | | | |
  4. Calculate the chi-squared contribution for the green eye and blonde hair category.
  5. With reference to the values in Table 4.2, discuss what the data suggest about brown eye colour and blue eye colour for people with blonde hair.
  6. A different researcher, carrying out the same investigation, independently takes a different random sample of size 800 and performs the same hypothesis test, but at the 1\% significance level, reaching the same conclusion as the original test. By comparing only the significance level of the two tests, specify which test, the one at the 10\% significance level or the one at the 1\% significance level, provides stronger evidence for the conclusion. Justify your answer.
OCR MEI Further Statistics Minor 2021 November Q3
13 marks Standard +0.3
    3 A student wants to know whether there is any association between age and whether or not people smoke. The student takes a sample of 120 adults and asks each of them whether or not they smoke. Below is a screenshot showing part of a spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted.
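    Spreadsheet columns A to E; the first column below gives the spreadsheet row, and blank cells are omitted values.

    | Row | | | | | |
    |---|---|---|---|---|---|
    | 1 | | | Observed frequency | | |
    | 2 | | | Age | | |
    | 3 | | | 16-34 | 35-59 | 60 and over |
    | 4 | Smoking status | Smoker | 13 | 7 | 3 |
    | 5 | | Non-smoker | 28 | 43 | 26 |
    | 7 | | | Expected frequency | | |
    | 8 | | | 7.8583 | | |
    | 9 | | | 33.1417 | | |
    | 11 | | | Contributions to the test statistic | | |
    | 12 | | | 3.3642 | 0.6964 | 1.1775 |
    | 13 | | | | 0.1651 | 0.2792 |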
    1. The student wants to carry out a chi-squared test to analyse the data. State a requirement of the sample if the test is to be valid. For the rest of this question, you should assume that this requirement is met.
    2. Determine the missing values in each of the following cells.
      Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is any association between age and smoking status.
    3. Discuss what the data suggest about the smoking status for each different age group.
    OCR MEI Further Statistics Major 2019 June Q5
    13 marks Standard +0.3
    5 In an investigation into the possible relationship between smoking and weight in adults in a particular country, a researcher selected a random sample of 500 adults.
    The adults in the sample were classified according to smoking status (non-smoker, light smoker or heavy smoker, where light smoker indicates less than 10 cigarettes per day) and body weight (underweight, normal weight or overweight). Fig. 5 is a screenshot showing part of the spreadsheet used to calculate the contributions for a chi-squared test. Some values in the spreadsheet have been deliberately omitted.

    Fig. 5 (spreadsheet columns A to F; the first column below gives the spreadsheet row, and blank cells are omitted values)

    | Row | | | | | |
    |---|---|---|---|---|---|
    | 1 | Observed frequencies | | | | |
    | 2 | | Underweight | Normal | Overweight | Totals |
    | 3 | Non-smoker | 8 | 52 | 178 | 238 |
    | 4 | Light smoker | 10 | 40 | 68 | 118 |
    | 5 | Heavy smoker | 5 | 47 | 92 | 144 |
    | 6 | Totals | 23 | 139 | 338 | 500 |
    | 8 | Expected frequencies | | | | |
    | 9 | Non-smoker | 10.9480 | 66.1640 | 160.8880 | |
    | 10 | Light smoker | 5.4280 | | 79.7680 | |
    | 11 | Heavy smoker | | 40.0320 | 97.3440 | |
    | 14 | Non-smoker | 0.7938 | | 1.8200 | |
    | 15 | Light smoker | 3.8510 | 1.5785 | 1.7361 | |
    | 16 | Heavy smoker | 0.3982 | 1.2129 | 0.2934 | |
    1. Showing your calculations, find the missing values in each of the following cells.
    OCR MEI Further Statistics Major 2022 June Q10
    13 marks Standard +0.3
    10 A scientist is researching dietary fat intake and cholesterol level. A random sample of 60 people is selected and their dietary fat intakes and cholesterol levels are measured. Dietary fat intakes are classified as low, medium and high, and cholesterol levels are classified as normal and high. The scientist decides to carry out a chi-squared test to investigate whether there is any association between dietary fat intake and cholesterol level. Tables 10.1 and 10.2 show the data and some of the expected frequencies for the test.

    Table 10.1 (rows are cholesterol level, columns are dietary fat intake)

    | Cholesterol level | Low | Medium | High | Total |
    |---|---|---|---|---|
    | Normal | 9 | 18 | 5 | 32 |
    | High | 3 | 13 | 12 | 28 |
    | Total | 12 | 31 | 17 | 60 |

    Table 10.2: expected frequency (rows are cholesterol level, columns are dietary fat intake)

    | Cholesterol level | Low | Medium | High |
    |---|---|---|---|
    | Normal | | | 9.0667 |
    | High | | | 7.9333 |
    1. Complete the table of expected frequencies in the Printed Answer Booklet.
    2. Determine the contribution to the chi-squared test statistic for people with normal cholesterol level and high dietary fat intake, giving your answer to 4 decimal places.

      The contributions to the chi-squared test statistic for the remaining categories are shown in Table 10.3.

      Table 10.3 (rows are cholesterol level, columns are dietary fat intake)

      | Cholesterol level | Low | Medium | High |
      |---|---|---|---|
      | Normal | 1.0563 | 0.1301 | |
      | High | 1.2071 | 0.1487 | 2.0846 |
    3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
    4. For each level of dietary fat intake, give a brief interpretation of what the data suggest about the level of cholesterol.
OCR MEI Further Statistics Major 2023 June Q9
10 marks Standard +0.3
      9 A cyclist who lives on an island suspects that car drivers with locally registered number plates allow more space when passing her than those with non-locally registered number plates. She decides to carry out a hypothesis test and so over a period of time selects a random sample of 250 cars which pass her. For each car she estimates whether the car driver allows at least the recommended 1.5 metres when passing her. The table shows the data which she collected.
      | Passing distance | Where registered: Local | Non-local |
      |---|---|---|
      | Under 1.5 m | 12 | 11 |
      | At least 1.5 m | 157 | 70 |
      1. In this question you must show detailed reasoning. Carry out the test at the \(5 \%\) significance level to examine whether there is any association between where the car is registered and passing distance.
      2. A friend of the cyclist suggests that there may be a problem with the data, since the cyclist may have introduced some bias in estimating whether cars were allowing the recommended distance. Explain how any bias might have arisen.
      OCR MEI Further Statistics Major 2024 June Q9
      13 marks Standard +0.3
      9 A cyclist has 3 bicycles, a road bike, a gravel bike and an electric bike. She wishes to know if the bicycle which she is riding makes any difference to whether she reaches a speed of 25 mph or greater on a journey. She selects a random sample of 120 journeys and notes the bicycle and whether or not her maximum speed was 25 mph or greater. She decides to carry out a chi-squared test to investigate whether there is any association between bicycle type and whether her maximum speed is 25 mph or greater. Tables 9.1 and 9.2 show the data and some of the expected frequencies for the test.

      Table 9.1 (rows are maximum speed, columns are bicycle)

      | Maximum speed | Road | Gravel | Electric | Total |
      |---|---|---|---|---|
      | Less than 25 mph | 2 | 21 | 19 | 42 |
      | 25 mph or greater | 13 | 47 | 18 | 78 |
      | Total | 15 | 68 | 37 | 120 |

      Table 9.2: expected frequency (rows are maximum speed, columns are bicycle)

      | Maximum speed | Road | Gravel | Electric |
      |---|---|---|---|
      | Less than 25 mph | | | 12.95 |
      | 25 mph or greater | | | 24.05 |
      1. Complete the table of expected frequencies in the Printed Answer Booklet.
      2. Determine the contribution to the chi-squared test statistic for the Electric bicycle and maximum speed 25 mph or greater. Give your answer correct to 4 decimal places.

        The contributions to the chi-squared test statistic for the remaining categories are shown in Table 9.3.

        Table 9.3: contribution to the test statistic (rows are maximum speed, columns are bicycle)

        | Maximum speed | Road | Gravel | Electric |
        |---|---|---|---|
        | Less than 25 mph | 2.0119 | 0.3294 | 2.8264 |
        | 25 mph or greater | 1.0833 | 0.1774 | |
      3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
      4. For each type of bicycle, give a brief interpretation of what the data suggest about maximum speed.
      WJEC Further Unit 2 2019 June Q7
      13 marks Moderate -0.5
      7. An article published in a medical journal investigated sports injuries in adolescents' ball games: football, handball and basketball. In a study of 906 randomly selected adolescent players in the three ball games, 379 players incurred injuries over the course of one year of playing the sport. Rhian wants to test whether there is an association between the site of injury and the sport played. A summary of the injuries is shown in the table below.
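      Observed values (rows are sport, columns are site of injury)

      | Sport | Shoulder/Arm | Hand/Fingers | Thigh/Leg | Knee | Ankle | Foot | Other | Total |
      |---|---|---|---|---|---|---|---|---|
      | Football | 8 | 3 | 45 | 36 | 51 | 36 | 12 | 191 |
      | Handball | 14 | 26 | 6 | 15 | 42 | 6 | 6 | 115 |
      | Basketball | 4 | 28 | 4 | 4 | 22 | 1 | 10 | 73 |
      | Total | 26 | 57 | 55 | 55 | 115 | 43 | 28 | 379 |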
      1. Calculate the values of \(A , B , C\) in the tables below.
        Expected values (rows are sport, columns are site of injury)

        | Sport | Shoulder/Arm | Hand/Fingers | Thigh/Leg | Knee | Ankle | Foot | Other |
        |---|---|---|---|---|---|---|---|
        | Football | 13.1029 | 28.7256 | 27.7177 | 27.7177 | 57.9551 | 21.6702 | 14.1108 |
        | Handball | 7.8892 | 17.2955 | 16.6887 | 16.6887 | \(A\) | 13.0475 | 8.4960 |
        | Basketball | 5.0079 | 10.9789 | 10.5937 | 10.5937 | 22.1504 | 8.2823 | 5.3931 |

        Chi-squared contributions (rows are sport, columns are site of injury)

        | Sport | Shoulder/Arm | Hand/Fingers | Thigh/Leg | Knee | Ankle | Foot | Other |
        |---|---|---|---|---|---|---|---|
        | Football | 1.98732 | 23.03890 | 10.77575 | 2.47484 | \(B\) | 9.47586 | 0.31575 |
        | Handball | 4.73333 | 4.38079 | \(C\) | 0.17087 | 1.44690 | 3.80664 | 0.73331 |
        | Basketball | 0.20286 | 26.38865 | 4.10400 | 4.10400 | 0.00102 | 6.40306 | 3.93521 |
      2. Given that the test statistic, \(X ^ { 2 }\), is 116.16, carry out the significance test at the \(5 \%\) level.
      3. Which site of injury most affects the conclusion of this test? Comment on your answer.

        Rhian also analyses the data on the type of contact that caused the injuries and the sport in which they occur, shown in the table below.

        | Observed values | Ball | Opponent | Surface | None | Total |
        |---|---|---|---|---|---|
        | Football | 17 | 68 | 17 | 92 | 194 |
        | Handball | 23 | 34 | 19 | 38 | 114 |
        | Basketball | 28 | 17 | 12 | 14 | 71 |
        | Total | 68 | 119 | 48 | 144 | 379 |

        The chi-squared test statistic is 46.0937. Rhian notes that this value is smaller than 116.16, the test statistic in part (b). She concludes that there is weaker evidence for association in this case than there was in part (b).
      4. State Rhian's misconception and explain what she should consider instead.
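The two tables in this question have different shapes, so their test statistics are referred to chi-squared distributions with different degrees of freedom. One way to put them on a common scale is to convert each statistic to a \(p\)-value. The sketch below does this without any statistics library by using the closed form for the upper tail of a chi-squared distribution with an even number of degrees of freedom; the function names are hypothetical, and it is assumed that no classes were combined in either table.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table with no pooled classes."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for X ~ chi-squared(df), valid only for positive even df.

    For df = 2k the tail is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


df_site = degrees_of_freedom(3, 7)     # 3 sports x 7 sites of injury
df_contact = degrees_of_freedom(3, 4)  # 3 sports x 4 types of contact

p_site = chi2_upper_tail_even_df(116.16, df_site)
p_contact = chi2_upper_tail_even_df(46.0937, df_contact)

print(df_site, p_site)
print(df_contact, p_contact)
```

Both \(p\)-values are far below 5%, so each test rejects independence; the point of the comparison is that the strength of evidence is read from the \(p\)-values (or from each statistic against its own critical value), not from the raw statistics.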
      WJEC Further Unit 2 2022 June Q6
      11 marks Standard +0.3
      6. An online survey on the use of social media asked the following question: \begin{displayquote} "Do you use any form of social media?" \end{displayquote} The results for a total of 1953 respondents are shown in the table below.
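      | Use social media | Age in years: 18-29 | 30-49 | 50-64 | 65 or older | Total |
      |---|---|---|---|---|---|
      | Yes | 310 | 412 | 348 | 196 | 1266 |
      | No | 42 | 116 | 196 | 333 | 687 |
      | Total | 352 | 528 | 544 | 529 | 1953 |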
      To test whether there is a relationship between social media use and age, a significance test is carried out at the \(5 \%\) level.
      1. State the null and alternative hypotheses.
      2. Show how the expected frequency 228.18 is calculated in the table below.

        Expected values

        | Use social media | Age in years: 18-29 | 30-49 | 50-64 | 65 or older |
        |---|---|---|---|---|
        | Yes | 228.18 | 342.27 | 352.64 | 342.92 |
        | No | 123.82 | 185.73 | 191.36 | 186.08 |
      3. Determine the value of \(s\) in the table below.
        Chi-squared contributions

        | Use social media | Age in years: 18-29 | 30-49 | 50-64 | 65 or older |
        |---|---|---|---|---|
        | Yes | 29.34 | \(s\) | 0.06 | 62.94 |
        | No | 54.07 | 26.18 | 0.11 | 115.99 |
      4. Complete the significance test, showing all your working.
      5. A student, analysing these data on a spreadsheet, obtains the following output. [Image: spreadsheet output, not reproduced] Explain why the student must have made an error in calculating the \(p\)-value.
      WJEC Further Unit 2 2024 June Q5
      12 marks Moderate -0.5
      5. Lily is interested in the relationship between the way in which students learned Welsh and their attitude towards the Welsh language. Students were categorised as having learned Welsh in one of three ways:
      • from one Welsh-speaking parent/carer at home,
      • from two Welsh-speaking parents/carers at home,
      • at school only, for those with no Welsh-speaking parents/carers at home.
      The students were asked to rate their attitude towards the Welsh language from 'Very negative' to 'Very positive'. The following data for a random sample of 253 students were collected as part of a project.
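      | Attitude | Learned Welsh: From two parents/carers | From one parent/carer | At school only | Total |
      |---|---|---|---|---|
      | Very negative | 2 | 14 | 30 | 46 |
      | Slightly negative | 4 | 20 | 21 | 45 |
      | Neutral | 12 | 17 | 8 | 37 |
      | Slightly positive | 21 | 19 | 11 | 51 |
      | Very positive | 25 | 21 | 28 | 74 |
      | Total | 64 | 91 | 98 | 253 |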
      Lily intends to carry out a chi-squared test for independence at the \(5 \%\) level. She produces the following tables which are incomplete.
      Expected frequencies

      | Attitude | Learned Welsh: From two parents/carers | From one parent/carer | At school only |
      |---|---|---|---|
      | Very negative | 11.64 | 16.55 | 17.82 |
      | Slightly negative | 11.38 | 16.19 | 17.43 |
      | Neutral | 9.36 | 13.31 | 14.33 |
      | Slightly positive | 12.90 | 18.34 | 19.75 |
      | Very positive | \(F\) | 26.62 | 28.66 |
      Chi-squared contributions

      | Attitude | Learned Welsh: From two parents/carers | From one parent/carer | At school only |
      |---|---|---|---|
      | Very negative | 7.98 | 0.39 | 8.33 |
      | Slightly negative | 4.79 | 0.90 | 0.73 |
      | Neutral | 0.74 | 1.02 | \(G\) |
      | Slightly positive | 5.08 | 0.02 | 3.88 |
      | Very positive | 2.11 | 1.19 | 0.02 |
      | Total | 20.70 | 3.52 | \(H\) |
      1. Calculate the values of \(F , G\) and \(H\).
      2. Carry out Lily's chi-squared test for independence at the \(5 \%\) level.
      3. By referring to the figures in the tables on pages 16 and 17, give two comments on the relationship between the way students learned Welsh and their attitude towards the Welsh language.
      AQA Further Paper 3 Statistics Specimen Q5
      8 marks Standard +0.3
      5 Students at a science department of a university are offered the opportunity to study an optional language module, either German or Mandarin, during their second year of study. From a sample of 50 students who opted to study a language module, 31 were female. Of those who opted to study Mandarin, 8 were female and 12 were male. Test, using the \(5 \%\) level of significance, whether choice of language is independent of gender. The sample of students may be regarded as random.
      [8 marks]
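In this question the contingency table is not printed; it has to be rebuilt from the totals given in the text before any test can be run. The sketch below shows that bookkeeping and then the expected frequencies and test statistic. Applying Yates' correction to the resulting 2 × 2 table is an assumption about the intended method, and the variable names are illustrative.

```python
# Facts stated in the question.
total_students = 50
total_female = 31
mandarin_female, mandarin_male = 8, 12

# Remaining cells follow by subtraction.
total_male = total_students - total_female
german_female = total_female - mandarin_female
german_male = total_male - mandarin_male

# Rows: female, male. Columns: German, Mandarin.
observed = [[german_female, mandarin_female], [german_male, mandarin_male]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
expected = [[r * c / total_students for c in col_totals] for r in row_totals]

# Yates-corrected statistic for a 2 x 2 table: sum of (|O - E| - 0.5)^2 / E.
statistic = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CRITICAL_5_PERCENT_1_DF = 3.841  # standard table value

print(observed)
print(expected)
print(round(statistic, 2), statistic > CRITICAL_5_PERCENT_1_DF)
```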
      Edexcel FS1 AS 2018 June Q4
      7 marks Standard +0.3
      1. Abram carried out a survey of two treatments for a plant fungus. The contingency table below shows the results of a survey of a random sample of 125 plants with the fungus.
      | Outcome | Treatment: No action | Plant sprayed once | Plant sprayed every day |
      |---|---|---|---|
      | Plant died within a month | 15 | 16 | 25 |
      | Plant survived for 1-6 months | 8 | 25 | 10 |
      | Plant survived beyond 6 months | 7 | 14 | 5 |
      Abram calculates expected frequencies to carry out a suitable test. Seven of these are given in the partly-completed table below.
      | Outcome | Treatment: No action | Plant sprayed once | Plant sprayed every day |
      |---|---|---|---|
      | Plant died within a month | | | 17.92 |
      | Plant survived for 1-6 months | 10.32 | 18.92 | 13.76 |
      | Plant survived beyond 6 months | 6.24 | 11.44 | 8.32 |
      The value of \(\sum \frac { ( O - E ) ^ { 2 } } { E }\) for the 7 given values is 8.29
      Test at the \(2.5 \%\) level of significance, whether or not there is an association between the treatment of the plants and their survival. State your hypotheses and conclusion clearly.
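Because the question supplies the partial sum 8.29 for the seven given cells, only the two remaining cells need to be worked out and added. A minimal sketch of that step follows; it assumes the two blank cells are the "No action" and "Plant sprayed once" entries in the first row (consistent with 17.92 being the expected value for "Plant sprayed every day"), and takes the 2.5% critical value for 4 degrees of freedom, 11.143, from standard tables.

```python
# Observed counts: rows = outcome, columns = treatment.
observed = [
    [15, 16, 25],  # died within a month
    [8, 25, 10],   # survived for 1-6 months
    [7, 14, 5],    # survived beyond 6 months
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

GIVEN_PARTIAL_SUM = 8.29          # stated in the question for the 7 given cells
missing_cells = [(0, 0), (0, 1)]  # assumed positions of the two blank cells

extra = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i, j in missing_cells
)
statistic = GIVEN_PARTIAL_SUM + extra

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_2_5_PERCENT_4_DF = 11.143  # standard table value

print([round(expected[i][j], 2) for i, j in missing_cells])
print(round(statistic, 2), degrees_of_freedom, statistic > CRITICAL_2_5_PERCENT_4_DF)
```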
      Edexcel FS1 AS 2019 June Q1
      6 marks Standard +0.3
      1. A leisure club offers a choice of one of three activities to its 150 members on a Tuesday evening. The manager believes that there may be an association between the choice of activity and the age of the member and collected the following data.
      | Age \(a\) years | Badminton | Bowls | Snooker |
      |---|---|---|---|
      | \(a < 20\) | 9 | 3 | 3 |
      | \(20 \leqslant a < 40\) | 10 | 10 | 14 |
      | \(40 \leqslant a < 50\) | 16 | 15 | 5 |
      | \(50 \leqslant a < 60\) | 15 | 13 | 11 |
      | \(a \geqslant 60\) | 4 | 19 | 3 |
      1. Write down suitable hypotheses for a test of the manager's belief. The manager calculated expected frequencies to use in the test.
      2. Calculate the expected frequency of members aged 60 or over who choose snooker, used by the manager.
      3. Explain why there are 6 degrees of freedom used in this test. The test statistic used to test the manager's belief is 19.583
      4. Using a 5\% level of significance, complete the test of the manager's belief.
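The 6 degrees of freedom in part 3 can be understood by checking the expected frequencies: a 5 × 3 table would give 8, but any row with an expected frequency below 5 is normally pooled with a neighbour first. The sketch below flags such rows and recomputes the degrees of freedom after pooling. Merging the youngest group with the adjacent 20 to 40 group is an assumption about how the manager pooled, and the helper name is hypothetical.

```python
ACTIVITIES = ["Badminton", "Bowls", "Snooker"]
observed = {
    "a < 20": [9, 3, 3],
    "20 <= a < 40": [10, 10, 14],
    "40 <= a < 50": [16, 15, 5],
    "50 <= a < 60": [15, 13, 11],
    "a >= 60": [4, 19, 3],
}


def expected_counts(rows):
    """Expected counts under independence for a list of rows of observed counts."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


labels = list(observed)
rows = [observed[label] for label in labels]

# Rows containing at least one expected frequency below 5.
expected = expected_counts(rows)
small_rows = [label for label, exp_row in zip(labels, expected) if min(exp_row) < 5]

# Assumed pooling: merge the first two age groups into "a < 40".
pooled_rows = [[x + y for x, y in zip(rows[0], rows[1])]] + rows[2:]
pooled_expected = expected_counts(pooled_rows)

degrees_of_freedom = (len(pooled_rows) - 1) * (len(ACTIVITIES) - 1)
CRITICAL_5_PERCENT_6_DF = 12.592  # standard table value

print(small_rows)
print(round(pooled_expected[-1][2], 2))  # members aged 60 or over choosing snooker
print(degrees_of_freedom, 19.583 > CRITICAL_5_PERCENT_6_DF)
```

Pooling two age groups does not change the column totals, so the expected frequency asked for in part 2 is the same before and after the merge.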